Differences

This shows you the differences between two versions of the page.

syntax:arrays [2012/04/03 18:51]
ormaaj bit shorter.
syntax:arrays [2013/07/21 02:04] (current)
ormaaj [Bugs] More bugs fixed!!
Line 1: Line 1:
 ====== Arrays ====== ====== Arrays ======
- 
-For completeness and details on several parameter expansion variants, see the [[syntax:​pe|article about parameter expansion]] and check the notes about arrays. 
  
 ===== Purpose ===== ===== Purpose =====
  
-An array is a way for the coder to collect multiple values (data, text, numbers) under common name, the //name of the array//. The specific values are accessible using an index into the array. Thus, the array (in any programming language) is a useful and common data structure.
 +An array is a parameter that holds mappings from keys to values. Arrays are used to store a collection of parameters in a single parameter. Arrays (in any programming language) are a useful and common composite data structure, and one of the most important scripting features in Bash and other shells.
  
-This is **symbolic** (not a real!) view of an array named ''NAMES'', the indexes go from 1 to 4 (technically, 0 to 3) here.
 +Here is an **abstract** representation of an array named ''NAMES''. The indexes go from 0 to 3.
 <​code>​ <​code>​
 NAMES NAMES
Line 16: Line 14:
 </​code>​ </​code>​
  
-The purpose is clear: If you want the second name, you ask for the index 1 of the array ''NAME''. All your names are in the array ''NAME'', you don't need 4 variables for 4 names, just one variable (the array) which contains many //elements//.
 +Instead of using 4 separate variables, multiple related variables are grouped together into //elements// of the array, accessible by their //key//. If you want the second name, ask for index 1 of the array ''NAMES''.
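As a minimal sketch of the idea above: the four names below are hypothetical placeholders (the article's own element list is not reproduced here), and the variable names ''second'' and ''count'' are only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical names, chosen only for this illustration.
NAMES=(Peter Anna Greg Jan)

second=${NAMES[1]}       # index 1 is the second element (indexes start at 0)
count=${#NAMES[@]}       # number of elements in the array

printf 'second name: %s\n' "$second"
printf 'number of names: %d\n' "$count"
```

One array replaces four separate variables, and each name stays reachable through its index.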
  
-__**Attention:​**__ +===== Indexing =====
-  * As in C, the array indexes start at 0 (zero)! +
-  * All Bash arrays are **sparse**! +
-  * Multidimensional arrays are **not implemented**+
  
 +Bash supports two different types of ksh-like one-dimensional arrays. **Multidimensional arrays are not implemented**.
 +  * //Indexed arrays// use non-negative integers as keys. Indexed arrays are **always sparse**, meaning indexes are not necessarily contiguous. All syntax used for both assigning and dereferencing indexed arrays is an [[syntax:arith_expr | arithmetic evaluation context]] (see [[#Referencing]]). As in C and many other languages, the numerical array indexes start at 0 (zero). Indexed arrays are the most common, useful, and portable type. Indexed arrays were first introduced to Bourne-like shells by ksh88. Similar, partially compatible syntax was inherited by many derivatives including Bash. Indexed arrays always carry the ''-a'' attribute.
 +  * //​Associative arrays// (sometimes known as a "​hash"​ or "​dict"​) use arbitrary nonempty strings as keys. In other words, associative arrays allow you to look up a value from a table based upon its corresponding string label. **Associative arrays are always unordered**,​ they merely //​associate//​ key-value pairs. If you retrieve multiple values from the array at once, you can't count on them coming out in the same order you put them in. Associative arrays always carry the ''​-A''​ attribute, and unlike indexed arrays, Bash requires that they always be declared explicitly (as indexed arrays are the default, see [[#​Declaration | declaration]]). Associative arrays were first introduced in ksh93, and similar mechanisms were later adopted by Zsh and Bash version 4. These three are currently the only POSIX-compatible shells with any associative array support.
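The two array types described in the list above can be sketched as follows. The array names (''sparse'', ''color'') and their contents are made up for the example, and the associative part assumes Bash 4 or later.

```shell
#!/usr/bin/env bash
# Indexed arrays are sparse: only the indexes that were assigned exist.
sparse=([2]=two [10]=ten)
sparse_count=${#sparse[@]}        # 2, not 11
sparse_keys="${!sparse[*]}"       # "2 10"

# Associative arrays must be declared explicitly with -A.
declare -A color=([apple]=red [sky]=blue)
apple_color=${color[apple]}

printf '%s\n' "$sparse_count" "$sparse_keys" "$apple_color"
```

The key order of ''color'' is deliberately not printed, since associative arrays make no ordering promise.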
  
-===== Indexing =====
 +===== Syntax =====
  
-Bash supports two different indexing methods for arrays: +==== Referencing ====
-  * integer numbers (starts at 0) +
-  * :V4: strings+
  
-The indexing by numbers is what was shown above. Every element is indexed by an integer number, and all syntax used for both assigning and dereferencing indexed arrays is an arithmetic evaluation context.
 +To accommodate referring to array variables and their individual elements, Bash extends the parameter naming scheme with a subscript suffix. Any valid ordinary scalar parameter name is also a valid array name: ''<nowiki>[[:alpha:]_][[:alnum:]_]*</nowiki>''. The parameter name may be followed by an optional subscript enclosed in square brackets to refer to a member of the array.
  
-:V4: The indexing by strings is called //associative//.
 +The overall syntax is ''arrname[subscript]'' - where for indexed arrays, ''subscript'' is any valid arithmetic expression, and for associative arrays, any nonempty string. Subscripts are first processed for parameter and arithmetic expansions, and command and process substitutions. When used within parameter expansions or as an argument to the [[commands/builtin/unset | unset]] builtin, the special subscripts ''*'' and ''@'' are also accepted which act upon arrays analogously to the way the ''@'' and ''*'' special parameters act upon the positional parameters. In parsing the subscript, bash ignores any text that follows the closing bracket up to the end of the parameter name.
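A short sketch of what "any valid arithmetic expression" means for an indexed-array subscript. The array ''arr'' and the variable ''i'' are illustrative names only.

```shell
#!/usr/bin/env bash
arr=(zero one two three)
i=1

# The subscript of an indexed array is evaluated arithmetically,
# so variable names need no "$" and operators may be used.
picked=${arr[i+1]}       # element 2
arr[i+2]=THREE           # assigns to element 3

printf '%s\n' "$picked" "${arr[3]}"
```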
  
-===== Syntax =====
 +With few exceptions, names of this form may be used anywhere ordinary parameter names are valid, such as within [[syntax:arith_expr | arithmetic expressions]], [[syntax:pe | parameter expansions]], and as arguments to builtins that accept parameter names. An //array// is a Bash parameter that has been given the ''-a'' (for indexed) or ''-A'' (for associative) //attributes//. However, any regular (non-special or positional) parameter may be validly referenced using a subscript, because in most contexts, referring to the zeroth element of an array is synonymous with referring to the array name without a subscript.
 + 
 +<​code>​ 
 +# "​x"​ is an ordinary non-array parameter. 
 +$ x=hi; printf '%s ' "​$x"​ "​${x[0]}";​ echo "​${_[0]}"​ 
 +hi hi hi 
 +</​code>​ 
 + 
 +The only exceptions to this rule are in a few cases where the array variable'​s name refers to the array as a whole. This is the case for the ''​unset''​ builtin (see [[#​Destruction | destruction]]) and when declaring an array without assigning any values (see [[#​Declaration | declaration]]).
  
 ==== Declaration ==== ==== Declaration ====
Line 44: Line 48:
 |''​ARRAY[0]=''​ |Generally sets the first element of an **indexed** array. If no array ''​ARRAY''​ existed before, it is created. | |''​ARRAY[0]=''​ |Generally sets the first element of an **indexed** array. If no array ''​ARRAY''​ existed before, it is created. |
 |''​declare -a ARRAY''​ |Declares an **indexed** array ''​ARRAY''​. An existing array is not initialized. | |''​declare -a ARRAY''​ |Declares an **indexed** array ''​ARRAY''​. An existing array is not initialized. |
-|''​declare -A ARRAY''​ |:V4: Declares an **associative** array ''​ARRAY''​. This is the one and only way to create associative arrays. |+|''​declare -A ARRAY''​ |Declares an **associative** array ''​ARRAY''​. This is the one and only way to create associative arrays. |
  
 ==== Storing values ==== ==== Storing values ====
Line 51: Line 55:
  
 ^Syntax ^Description ^ ^Syntax ^Description ^
-|''​ARRAY[N]=VALUE''​ |Sets the element ''​N''​ of the **indexed** array ''​ARRAY''​ to ''​VALUE''​. **''​N''​ can be any valid [[syntax:​arith_expr | arithmetic expression]]**. Negative indexes can't be used for assignment even if ''​ARRAY''​ already exists (unlike ksh and zsh, but see "​getting values"​ below for negative indexes in Bash). | +|''​ARRAY[N]=VALUE''​ |Sets the element ''​N''​ of the **indexed** array ''​ARRAY''​ to ''​VALUE''​. **''​N''​ can be any valid [[syntax:​arith_expr | arithmetic expression]]**. | 
-|''​ARRAY[STRING]=VALUE''​ |:V4: Sets the element indexed by ''​STRING''​ of the **associative array** ''​ARRAY''​. |+|''​ARRAY[STRING]=VALUE''​ |Sets the element indexed by ''​STRING''​ of the **associative array** ''​ARRAY''​. |
 |''​ARRAY=VALUE''​ |As above. If no index is given, as a default the zeroth element is set to ''​VALUE''​. Careful, this is even true of associative arrays - there is no error if no key is specified, and the value is assigned to string index "​0"​. | |''​ARRAY=VALUE''​ |As above. If no index is given, as a default the zeroth element is set to ''​VALUE''​. Careful, this is even true of associative arrays - there is no error if no key is specified, and the value is assigned to string index "​0"​. |
-|''​ARRAY=(E1\ E2\ ...)''​ | Compound array assignment - sets the whole array ''​ARRAY''​ to the given list of elements indexed sequentially starting at zero. The array is unset before assignment unless the += operator is used. When the list is empty (''​ARRAY=()''​),​ the array will be set to an empty array. ​:V4: This method obviously does not use explicit indexes. An **associative array** can **not** be set like that! Clearing an associative array using ''​ARRAY=()''​ works. |+|''​ARRAY=(E1\ E2\ ...)''​ | Compound array assignment - sets the whole array ''​ARRAY''​ to the given list of elements indexed sequentially starting at zero. The array is unset before assignment unless the += operator is used. When the list is empty (''​ARRAY=()''​),​ the array will be set to an empty array. This method obviously does not use explicit indexes. An **associative array** can **not** be set like that! Clearing an associative array using ''​ARRAY=()''​ works. |
 |''​ARRAY=([X]=E1\ [Y]=E2\ ...)''​ |Compound assignment for indexed arrays with index-value pairs declared individually (here for example ''​X''​ and ''​Y''​). X and Y are arithmetic expressions. This syntax can be combined with the above - elements declared without an explicitly specified index are assigned sequentially starting at either the last element with an explicit index, or zero. | |''​ARRAY=([X]=E1\ [Y]=E2\ ...)''​ |Compound assignment for indexed arrays with index-value pairs declared individually (here for example ''​X''​ and ''​Y''​). X and Y are arithmetic expressions. This syntax can be combined with the above - elements declared without an explicitly specified index are assigned sequentially starting at either the last element with an explicit index, or zero. |
-|''​ARRAY=([S1]=E1\ [S2]=E2\ ...)''​ |:V4: Individual mass-setting for **associative arrays**. The named indexes (here: ''​S1''​ and ''​S2''​) are strings. |+|''​ARRAY=([S1]=E1\ [S2]=E2\ ...)''​ |Individual mass-setting for **associative arrays**. The named indexes (here: ''​S1''​ and ''​S2''​) are strings. | 
 +|''​ARRAY+=(E1\ E2\ ...)''​ |Append to ARRAY. |
  
 As of now, arrays can't be exported. As of now, arrays can't be exported.
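The assignment forms in the table above can be tried out with a small sketch like this one. The array names and values are invented for the example.

```shell
#!/usr/bin/env bash
fruits=(apple banana)          # compound assignment: indexes 0 and 1
fruits[5]=fig                  # single element; leaves a gap (sparse)
fruits+=(grape)                # appends after the highest index, i.e. index 6
mixed=([3]=c d e)              # d and e continue at indexes 4 and 5

fruit_keys="${!fruits[*]}"
mixed_keys="${!mixed[*]}"
printf '%s\n' "$fruit_keys" "$mixed_keys"
```

Listing the indexes with ''${!ARRAY[*]}'' is a convenient way to check where each form actually stored its elements.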
 ==== Getting values ==== ==== Getting values ====
 +<​note>​
 +For completeness and details on several parameter expansion variants, see the [[syntax:​pe|article about parameter expansion]] and check the notes about arrays.
 +</​note>​
  
-^ Syntax ​                                                                        ​^ Description ​                                                                                                                                                                                                                                                                                                                                                          ​+^Syntax ^Description ^ 
-| ''​${ARRAY[N]}'' ​                                                               | Expands to the value of the index ''​N''​ in the **indexed** array ''​ARRAY''​. If ''​N''​ is a negative number, it's treated as the offset from the maximum assigned index (can't be used for assignment) - 1 /:V4: 4.2-alpha) ​                                                                                                                                                                   ​+| ''​${ARRAY[N]}''​ | Expands to the value of the index ''​N''​ in the **indexed** array ''​ARRAY''​. If ''​N''​ is a negative number, it's treated as the offset from the maximum assigned index (can't be used for assignment) - 1  
-| ''​${ARRAY[S]}'' ​                                                               :V4: Expands to the value of the index ''​S''​ in the **associative** array ''​ARRAY''​. ​                                                                                                                                                                                                                                                                                 +| ''​${ARRAY[S]}''​ | Expands to the value of the index ''​S''​ in the **associative** array ''​ARRAY''​. | 
-| ''"​${ARRAY[@]}"​\\ ${ARRAY[@]}\\ "​${ARRAY[*]}"​\\ ${ARRAY[*]}'' ​                 | Similar to [[scripting:​posparams#​mass_usage| mass-expanding positional parameters]],​ this expands to all elements. If unquoted, both subscripts ''​*''​ and ''​@''​ expand to the same result, if quoted, ''​@''​ expands to all elements individually quoted, ''​*''​ expands to all elements quoted as a whole. ​                                                            ​+| ''"​${ARRAY[@]}"​\\ ${ARRAY[@]}\\ "​${ARRAY[*]}"​\\ ${ARRAY[*]}''​ | Similar to [[scripting:​posparams#​mass_usage| mass-expanding positional parameters]],​ this expands to all elements. If unquoted, both subscripts ''​*''​ and ''​@''​ expand to the same result, if quoted, ''​@''​ expands to all elements individually quoted, ''​*''​ expands to all elements quoted as a whole. | 
-| ''"​${ARRAY[@]:​N:​M}"​\\ ${ARRAY[@]:​N:​M}\\ "​${ARRAY[*]:​N:​M}"​\\ ${ARRAY[*]:​N:​M}'' ​ | Similar to what this syntax does for the characters of a single string when doing [[syntax:​pe#​substring_expansion| substring expansion]],​ this expands to ''​M''​ elements starting with element ''​N''​. This way you can mass-expand individual indexes. The rules for quoting and the subscripts ''​*''​ and ''​@''​ are the same as above for the other mass-expansions. ​ | +| ''"​${ARRAY[@]:​N:​M}"​\\ ${ARRAY[@]:​N:​M}\\ "​${ARRAY[*]:​N:​M}"​\\ ${ARRAY[*]:​N:​M}''​ | Similar to what this syntax does for the characters of a single string when doing [[syntax:​pe#​substring_expansion| substring expansion]],​ this expands to ''​M''​ elements starting with element ''​N''​. This way you can mass-expand individual indexes. The rules for quoting and the subscripts ''​*''​ and ''​@''​ are the same as above for the other mass-expansions. |
  
 For clarification:​ When you use the subscripts ''​@''​ or ''​*''​ for mass-expanding,​ then the behaviour is exactly what it is for ''​$@''​ and ''​$*''​ when [[scripting:​posparams#​mass_usage | mass-expanding the positional parameters]]. You should read this article to understand what's going on. For clarification:​ When you use the subscripts ''​@''​ or ''​*''​ for mass-expanding,​ then the behaviour is exactly what it is for ''​$@''​ and ''​$*''​ when [[scripting:​posparams#​mass_usage | mass-expanding the positional parameters]]. You should read this article to understand what's going on.
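To make the difference between the ''@'' and ''*'' subscripts concrete, here is a sketch that counts the words each expansion produces. ''count_args'' is a helper defined only for this example, and the default ''IFS'' is assumed.

```shell
#!/usr/bin/env bash
# Helper for this example only: prints how many arguments it received.
count_args() { echo "$#"; }

words=("first item" "second item" third)

at_count=$(count_args "${words[@]}")       # 3: one word per element
star_count=$(count_args "${words[*]}")     # 1: all elements joined into one word
unquoted_count=$(count_args ${words[@]})   # 5: word splitting breaks elements apart
slice="${words[*]:1:2}"                    # two elements starting at index 1

printf '%s\n' "$at_count" "$star_count" "$unquoted_count" "$slice"
```

In most scripts the quoted ''@'' form is the one that preserves each element as-is.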
Line 74: Line 81:
 ^Syntax ^Description ^ ^Syntax ^Description ^
 |''​${#​ARRAY[N]}''​ |Expands to the **length** of an individual array member at index ''​N''​ (**stringlength**) | |''​${#​ARRAY[N]}''​ |Expands to the **length** of an individual array member at index ''​N''​ (**stringlength**) |
-|''​${#​ARRAY[STRING]}''​ |:V4: Expands to the **length** of an individual associative array member at index ''​STRING''​ (**stringlength**) |+|''​${#​ARRAY[STRING]}''​ | Expands to the **length** of an individual associative array member at index ''​STRING''​ (**stringlength**) |
 |''​${#​ARRAY[@]}''​\\ ''​${#​ARRAY[*]}''​|Expands to the **number of elements** in ''​ARRAY''​ | |''​${#​ARRAY[@]}''​\\ ''​${#​ARRAY[*]}''​|Expands to the **number of elements** in ''​ARRAY''​ |
 |''​${!ARRAY[@]}''​\\ ''​${!ARRAY[*]}''​|Expands to the **indexes** in ''​ARRAY''​ since BASH 3.0| |''​${!ARRAY[@]}''​\\ ''​${!ARRAY[*]}''​|Expands to the **indexes** in ''​ARRAY''​ since BASH 3.0|
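The metadata expansions from the table above, shown side by side in a small sketch. The arrays ''list'' and ''size'' are hypothetical, and the associative array assumes Bash 4 or later.

```shell
#!/usr/bin/env bash
list=(alpha be [7]=gamma)
declare -A size=([small]=S [extra-large]=XL)

len_first=${#list[0]}           # string length of "alpha"
n_list=${#list[@]}              # number of elements, not highest index + 1
keys_list="${!list[*]}"         # the indexes that are actually set
n_size=${#size[@]}              # number of key-value pairs
len_xl=${#size[extra-large]}    # string length of the value "XL"

printf '%s\n' "$len_first" "$n_list" "$keys_list" "$n_size" "$len_xl"
```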
  
 ==== Destruction ==== ==== Destruction ====
-The ''​unset'' ​builtin command is used to destroy (unset) arrays or individual elements of arrays.+The [[commands/​builtin/​unset | unset]] ​builtin command is used to destroy (unset) arrays or individual elements of arrays.
  
 ^Syntax ^Description ^ ^Syntax ^Description ^
-|''​unset ARRAY''​\\ ''​unset ARRAY[@]''​\\ ''​unset ARRAY[*]''​ |Destroys a complete array | +|''​unset ​-v ARRAY''​\\ ''​unset ​-v ARRAY[@]''​\\ ''​unset ​-v ARRAY[*]''​ |Destroys a complete array | 
-|''​unset ARRAY[N] ''​|Destroys the array element at index ''​N''​ | +|''​unset ​-v ARRAY[N]''​|Destroys the array element at index ''​N''​ | 
-|''​unset ARRAY[STRING] ''​|:V4: Destroys the array element of the associative array at index ''​STRING''​ |+|''​unset ​-v ARRAY[STRING]''​|Destroys the array element of the associative array at index ''​STRING''​ | 
 + 
 +It is best to [[commands/​builtin/​unset#​portability_considerations | explicitly specify -v]] when unsetting variables with unset.
  
 <note warning> <note warning>
Line 97: Line 106:
 Even worse, if ''​nullglob''​ is set, your array/index will disappear. Even worse, if ''​nullglob''​ is set, your array/index will disappear.
  
-To avoid this, either **disable pathname expansion** or **quote** the array name and index:
 +To avoid this, **always quote** the array name and index:
 <​code>​ <​code>​
-unset '​x[1]'​+unset -v '​x[1]'​
 </​code>​ </​code>​
 +
 +This applies generally to all commands which take variable names as arguments. Single quotes are preferred.
 </​note>​ </​note>​
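A brief sketch of the quoting advice in the note above, using a throwaway array ''x''. The variable names ''remaining_keys'' and ''still_set'' are illustrative.

```shell
#!/usr/bin/env bash
x=(a b c)

unset -v 'x[1]'            # quoted, so the shell never treats x[1] as a glob pattern
remaining_keys="${!x[*]}"  # indexes 0 and 2 remain; the array is now sparse

unset -v x                 # destroys the whole array
still_set=${x+yes}         # expands to nothing because x no longer exists

printf 'remaining: %s, still set: %s\n' "$remaining_keys" "${still_set:-no}"
```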
  
Line 145: Line 156:
 declare -A sentence declare -A sentence
  
-sentence[Begin]="Be liberal in what " +sentence[Begin]='Be liberal in what' 
-sentence[Middle]="you accept, and conservative ​" +sentence[Middle]='you accept, and conservative' 
-sentence[End]="in what you send" +sentence[End]='in what you send' 
-sentence["Very end"]="..."+sentence['Very end']=...
 </​code>​ </​code>​
  
Line 159: Line 170:
 <​code>​ <​code>​
 for element in Begin Middle End "Very end"; do for element in Begin Middle End "Very end"; do
-  ​printf "​%s"​ "​${sentence["$element"]}"+    ​printf "​%s"​ "​${sentence[$element]}"​
 done done
 printf "​\n"​ printf "​\n"​
Line 184: Line 195:
  
 <​code>​ <​code>​
- ~ $ ( declare -ia '​a=(2+4 [2]=2+2 [a[2]]=a[2])'​ '​a+=(42 [a[4]]+=3)';​ declare -p a )+ ~ $ ( declare -ia '​a=(2+4 [2]=2+2 [a[2]]="a[2]")' '​a+=(42 [a[4]]+=3)';​ declare -p a )
 declare -ai a='​([0]="​6"​ [2]="​4"​ [4]="​7"​ [5]="​42"​)'​ declare -ai a='​([0]="​6"​ [2]="​4"​ [4]="​7"​ [5]="​42"​)'​
 </​code>​ </​code>​
  
-The zeroth element is assigned to the result of 2+4. The next element is assigned the result of 2+2. The last index in the first assignment is the result of a[2], which has already been assigned as 4, and its value is also given a[2]. This shows that even though any existing arrays named ''a'' have already been unset by using ''='' instead of ''+='', arithmetic variables within keys can self-reference any elements already assigned within the same compound-assignment, and with integer arrays this is also true of the values so long as they are arithmetic variables, not expansions (beginning with a $ sign). The next assignment argument to declare uses +=, and since this is a compound assignment, it appends to the array rather than deleting it and creating a new array. The next-highest unassigned index is 5, which is set to 42. Lastly, the element whose index is the value of a[4] (which is 4), gets 3 added to it's existing value making a[4] == 7. Note that having the integer attribute set this time causes += to add, rather than append, as it would for a non-integer array.
 +''a[0]'' is assigned the result of ''2+4''. ''a[2]'' gets the result of ''2+2''. The last index in the first assignment is the result of ''a[2]'', which has already been assigned as ''4'', and its value is also given ''a[2]''.
  
-The single quotes force the assignments to be evaluated by declare. It's possible to get unexpected results without them (as we would in this case) due to the way in which these compound assignment arguments are evaluated. A special-case of this is shown in the next section.
 +This shows that even though any existing arrays named ''a'' in the current scope have already been unset by using ''='' instead of ''+='' in the compound assignment, arithmetic variables within keys can self-reference any elements already assigned within the same compound-assignment. With integer arrays this also applies to expressions to the right of the ''=''. (See [[#evaluation_order | evaluation order]], the right side of an arithmetic assignment is typically evaluated first in Bash.)
 + 
 +The second compound assignment argument to declare uses ''​+='',​ so it appends after the last element of the existing array rather than deleting it and creating a new array, so ''​a[5]''​ gets ''​42''​. 
 + 
 +Lastly, the element whose index is the value of ''​a[4]''​ (''​4''​),​ gets ''​3''​ added to its existing value, making ''​a[4]''​ == ''​7''​. Note that having the integer attribute set this time causes += to add, rather than append a string, as it would for a non-integer array. 
 + 
 +The single quotes force the assignments to be evaluated in the environment of ''declare''. This is important because attributes are only applied to the assignment after assignment arguments are processed. Without them the ''+='' compound assignment would have been invalid, and strings would have been inserted into the integer array without evaluating the arithmetic. A special-case of this is shown in the next section.
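The effect of the integer attribute on ''+='' can be isolated in a smaller sketch than the one-liner discussed above. The arrays ''nums'' and ''strs'' are hypothetical, and the attribute is set in a separate ''declare'' before any assignment so that the quoting question does not arise.

```shell
#!/usr/bin/env bash
declare -ia nums           # integer attribute set before any assignment
nums=(1 2)
nums[0]+=5                 # arithmetic addition: 1 + 5
nums+=(3+4)                # compound += appends one element, evaluated arithmetically

strs=(1 2)                 # ordinary array without the integer attribute
strs[0]+=5                 # string append: "1" followed by "5"

printf '%s\n' "${nums[*]}" "${strs[*]}"
```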
 + 
 +<​note>​ 
 +Bash declaration commands are really keywords in disguise. They magically parse arguments to determine whether they are in the form of a valid assignment. If so, they are evaluated as assignments. If not, they undergo normal argument expansion before being passed to the builtin, which evaluates the resulting string as an assignment (somewhat like ''eval'', but there are differences). **Todo:** Discuss this in detail.
 +</​note>​
  
 ==== Indirection ==== ==== Indirection ====
Line 228: Line 249:
  
 main main
 +</​code>​
 +
 +This script is one way of implementing a crude multidimensional associative array by storing array definitions in an array and referencing them through indirection. The script takes two keys and dynamically calls a function whose name is resolved from the array.
 +<​code>​
 +callFuncs() {
 +    # Set up indirect references as positional parameters to minimize local name collisions.
 +    set -- "​${@:​1:​3}"​ ${2+'​a["​$1"​]'​ "​$1"'​["​$2"​]'​}
 +
 +    # The only way to test for set but null parameters is unfortunately to test each individually.
 +    local x
 +    for x; do
 +        [[ $x ]] || return 0
 +    done
 +
 +    local -A a=(
 +        [foo]='​([r]=f [s]=g [t]=h)'​
 +        [bar]='​([u]=i [v]=j [w]=k)'​
 +        [baz]='​([x]=l [y]=m [z]=n)'​
 +        ) ${4+${a["​$1"​]+"​${1}=${!3}"​}} # For example, if "​$1"​ is "​bar"​ then define a new array: bar=([u]=i [v]=j [w]=k)
 +
 +    ${4+${a["​$1"​]+"​${!4-:​}"​}} # Now just lookup the new array. for inputs: "​bar"​ "​v",​ the function named "​j"​ will be called, which prints "​j"​ to stdout.
 +}
 +
 +main() {
 +    # Define functions named {f..n} which just print their own names.
 +    local fun='​() { echo "​$FUNCNAME";​ }' x
 +
 +    for x in {f..n}; do
 +        eval "​${x}${fun}"​
 +    done
 +
 +    callFuncs "​$@"​
 +}
 +
 +main "​$@"​
 +</​code>​
 +
 +===== Bugs and Portability Considerations =====
 +
 +  * Arrays are not specified by POSIX. One-dimensional indexed arrays are supported using similar syntax and semantics by most Korn-like shells.
 +  * Associative arrays are supported via ''​typeset -A''​ in Bash 4, Zsh, and Ksh93.
 +  * In Ksh93, arrays whose types are not given explicitly are not necessarily indexed. Arrays defined using compound assignments which specify subscripts are associative by default. In Bash, associative arrays can //only// be created by explicitly declaring them as associative,​ otherwise they are always indexed. In addition, ksh93 has several other compound structures whose types can be determined by the compound assignment syntax used to create them.
 +  * In Ksh93, using the ''​=''​ compound assignment operator unsets the array, including any attributes that have been set on the array prior to assignment. In order to preserve attributes, you must use the ''​+=''​ operator. However, declaring an associative array, then attempting an ''​a=(...)''​ style compound assignment without specifying indexes is an error. I can't explain this inconsistency.<​code>​
 + $ ksh -c '​function f { typeset -a a; a=([0]=foo [1]=bar); typeset -p a; }; f' # Attribute is lost, and since subscripts are given, we default to associative.
 +typeset -A a=([0]=foo [1]=bar)
 + $ ksh -c '​function f { typeset -a a; a+=([0]=foo [1]=bar); typeset -p a; }; f' # Now using += gives us the expected results.
 +typeset -a a=(foo bar)
 + $ ksh -c '​function f { typeset -A a; a=(foo bar); typeset -p a; }; f' # On top of that, the reverse does NOT unset the attribute. No idea why.
 + ksh: f: line 1: cannot append index array to associative array a
 +</​code>​
 +  * Only Bash and mksh support compound assignment with mixed explicit subscripts and automatically incrementing subscripts. In ksh93, in order to specify individual subscripts within a compound assignment, all subscripts must be given (or none). Zsh doesn'​t support specifying individual subscripts at all.
 +  * Appending to a compound assignment is a fairly portable way to append elements after the last index of an array. In Bash, this also sets append mode for all individual assignments within the compound assignment, such that if a lower subscript is specified, subsequent elements will be appended to previous values. In ksh93, it causes subscripts to be ignored, forcing appending everything after the last element. (Appending has different meaning due to support for multi-dimensional arrays and nested compound datastructures.) <​code>​
 + $ ksh -c '​function f { typeset -a a; a+=(foo bar baz); a+=([3]=blah [0]=bork [1]=blarg [2]=zooj); typeset -p a; }; f' # ksh93 forces appending to the array, disregarding subscripts
 +typeset -a a=(foo bar baz '​[3]=blah'​ '​[0]=bork'​ '​[1]=blarg'​ '​[2]=zooj'​)
 + $ bash -c '​function f { typeset -a a; a+=(foo bar baz); a+=(blah [0]=bork blarg zooj); typeset -p a; }; f' # Bash applies += to every individual subscript.
 +declare -a a='​([0]="​foobork"​ [1]="​barblarg"​ [2]="​bazzooj"​ [3]="​blah"​)'​
 + $ mksh -c '​function f { typeset -a a; a+=(foo bar baz); a+=(blah [0]=bork blarg zooj); typeset -p a; }; f' # Mksh does like Bash, but clobbers previous values rather than appending.
 +set -A a
 +typeset a[0]=bork
 +typeset a[1]=blarg
 +typeset a[2]=zooj
 +typeset a[3]=blah
 +</​code>​
 +  * In Bash and Zsh, the alternate value assignment parameter expansion (''​${arr[idx]:​=foo}''​) evaluates the subscript twice, first to determine whether to expand the alternate, and second to determine the index to assign the alternate to. See [[#​evaluation_order | evaluation order]]. <​code>​
 + $ : ${_[$(echo $RANDOM >&​2)1]:​=$(echo hi >&​2)}
 +13574
 +hi
 +14485
 +</​code>​
 +  * In Zsh, arrays are indexed starting at 1 in its default mode. Emulation modes are required in order to get any kind of portability.
 +  * Zsh and mksh do not support compound assignment arguments to ''​typeset''​.
 +  * Ksh88 didn't support modern compound array assignment syntax. The original (and most portable) way to assign multiple elements is to use the ''​set -A name arg1 arg2 ...''​ syntax. This is supported by almost all shells that support ksh-like arrays except for Bash. Additionally,​ these shells usually support an optional ''​-s''​ argument to ''​set''​ which performs lexicographic sorting on either array elements or the positional parameters. Bash has no built-in sorting ability other than the usual comparison operators. <​code>​
 + $ ksh -c 'set -A arr -- foo bar bork baz; typeset -p arr' # Classic array assignment syntax
 +typeset -a arr=(foo bar bork baz)
 + $ ksh -c 'set -sA arr -- foo bar bork baz; typeset -p arr' # Native sorting!
 +typeset -a arr=(bar baz bork foo)
 + $ mksh -c 'set -sA arr -- foo "​[3]=bar"​ "​[2]=baz"​ "​[7]=bork";​ typeset -p arr' # Probably a bug. I think the maintainer is aware of it.
 +set -A arr
 +typeset arr[2]=baz
 +typeset arr[3]=bar
 +typeset arr[7]=bork
 +typeset arr[8]=foo
 +</​code>​
 +  * Evaluation order for assignments involving arrays varies significantly depending on context. Notably, the order of evaluating the subscript or the value first can change in almost every shell for both expansions and arithmetic variables. See [[#​evaluation_order | evaluation order]] for details.
 +  * Bash 4.1.* and below cannot use negative subscripts to address array indexes relative to the highest-numbered index. You must use the subscript expansion, i.e. ''"​${arr[@]:​(-n):​1}"'',​ to expand the nth-last element (or the next-highest indexed after ''​n''​ if ''​arr[n]''​ is unset). In Bash 4.2, you may expand (but not assign to) a negative index. In Bash 4.3, ksh93, and zsh, you may both assign and expand negative offsets.
 +  * ksh93 also has an additional slice notation: ''"​${arr[n..m]}"''​ where ''​n''​ and ''​m''​ are arithmetic expressions. These are needed for use with multi-dimensional arrays.
 +  * Assigning or referencing negative indexes in mksh causes wrap-around. The max index appears to be ''​UINT_MAX'',​ which would be addressed by ''​arr[-1]''​.
 +  * So far, Bash's ''​-v var''​ test doesn'​t support individual array subscripts. You may supply an array name to test whether an array is defined, but can't check an element. ksh93'​s ''​-v''​ supports both. Other shells lack a ''​-v''​ test.
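For the negative-subscript item in the list above, here is a sketch of two ways to reach the last element without relying on ''${arr[-1]}''. The array contents and the names ''last_elem'', ''keys'', ''last_key'' and ''via_key'' are assumptions made for the example.

```shell
#!/usr/bin/env bash
arr=(a b c [10]=last)

# Slice of length 1 starting at the last element; the parentheses keep
# ":(-1)" from being parsed as the ":-" default-value expansion.
last_elem="${arr[@]:(-1):1}"

# Alternative: look up the highest index explicitly.
keys=("${!arr[@]}")
last_key=${keys[${#keys[@]}-1]}
via_key=${arr[last_key]}

printf '%s\n' "$last_elem" "$last_key" "$via_key"
```

The second form also reports which index the last element lives at, which the slice does not.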
 +
 +==== Bugs ====
 +
 +  * **Fixed in 4.3** Bash 4.2.* and earlier considers each chunk of a compound assignment, including the subscript for globbing. The subscript part is considered quoted, but any unquoted glob characters on the right-hand side of the ''​[...]=''​ will be clumped with the subscript and counted as a glob. Therefore, you must quote anything on the right of the ''​=''​ sign.  This is fixed in 4.3, so that each subscript assignment statement is expanded following the same rules as an ordinary assignment. This also works correctly in ksh93. <​code>​
 +$ touch '​[1]=a';​ bash -c '​a=([1]=*);​ echo "​${a[@]}"'​
 +[1]=a
 +</​code>​ mksh has a similar but even worse problem in that the entire subscript is considered a glob. <​code>​
 +$ touch 1=a; mksh -c '​a=([123]=*);​ print -r -- "​${a[@]}"'​
 +1=a
 +</​code>​
 +  * **Fixed in 4.3** In addition to the above globbing issue, assignments preceding "​declare"​ have an additional effect on brace and pathname expansion. <​code>​
 +$ set -x; foo=bar declare arr=( {1..10} )
 ++ foo=bar
 ++ declare '​a=(1)'​ '​a=(2)'​ '​a=(3)'​ '​a=(4)'​ '​a=(5)'​
 +
 +$ touch xy=foo
 +$ declare x[y]=*
 ++ declare '​x[y]=*'​
 +$ foo=bar declare x[y]=*
 ++ foo=bar
 ++ declare xy=foo
 +</​code>​ Each word (the entire assignment) is subject to globbing and brace expansion. This appears to trigger the same strange expansion mode as ''​let'',​ ''​eval'',​ other declaration commands, and maybe more. 
 +  * **Fixed in 4.3** Indirection combined with another modifier expands arrays to a single word. <​code>​
 +$ a=({a..c}) b=a[@]; printf '<​%s>​ ' "​${!b}";​ echo; printf '<​%s>​ ' "​${!b/​%/​foo}";​ echo
 +<a> <b> <c>
 +<a b cfoo>
 +</​code>​
 +  * **Fixed in 4.3** Process substitutions are evaluated within array indexes. Zsh and ksh don't do this in any arithmetic context. <​code> ​
 +# print "​moo"​
 +dev=fd=1 _[1<​(echo moo >&​2)]=
 +
 +# Fork bomb
 +${dev[${dev='​dev[1>​(${dev[dev]})]'​}]}
 +</​code>​
 +
 +==== Evaluation order ====
 +
 +Here are some of the nasty details of array assignment evaluation order. You can use this [[https://​gist.github.com/​ormaaj/​4942297 | testcase code]] to generate these results.
 +
 +<​code>​
 +Each testcase prints evaluation order for indexed array assignment
 +contexts. Each context is tested for expansions (represented by digits) and
 +arithmetic (letters), ordered from left to right within the expression. The
 +output corresponds to the way evaluation is re-ordered for each shell:
 +
 +a[ $1 a ]=${b[ $2 b ]:=${c[ $3 c ]}}               No attributes
 +a[ $1 a ]=${b[ $2 b ]:=c[ $3 c ]}                  typeset -ia a
 +a[ $1 a ]=${b[ $2 b ]:=c[ $3 c ]}                  typeset -ia b
 +a[ $1 a ]=${b[ $2 b ]:=c[ $3 c ]}                  typeset -ia a b
 +(( a[ $1 a ] = b[ $2 b ] ${c[ $3 c ]} ))           No attributes
 +(( a[ $1 a ] = ${b[ $2 b ]:=c[ $3 c ]} ))          typeset -ia b
 +a+=( [ $1 a ]=${b[ $2 b ]:=${c[ $3 c ]}} [ $4 d ]=$(( $5 e )) ) typeset -a a
 +a+=( [ $1 a ]=${b[ $2 b ]:=c[ $3 c ]} [ $4 d ]=${5}e ) typeset -ia a
 +
 +bash: 4.2.42(1)-release
 +2 b 3 c 2 b 1 a
 +2 b 3 2 b 1 a c
 +2 b 3 2 b c 1 a
 +2 b 3 2 b c 1 a c
 +1 2 3 c b a
 +1 2 b 3 2 b c c a
 +1 2 b 3 c 2 b 4 5 e a d
 +1 2 b 3 2 b 4 5 a c d e
 +
 +ksh93: Version AJM 93v- 2013-02-22
 +1 2 b b a
 +1 2 b b a
 +1 2 b b a
 +1 2 b b a
 +1 2 3 c b a
 +1 2 b b a
 +1 2 b b a 4 5 e d
 +1 2 b b a 4 5 d e
 +
 +mksh: @(#)MIRBSD KSH R44 2013/02/24
 +2 b 3 c 1 a
 +2 b 3 1 a c
 +2 b 3 c 1 a
 +2 b 3 c 1 a
 +1 2 3 c a b
 +1 2 b 3 c a
 +1 2 b 3 c 4 5 e a d
 +1 2 b 3 4 5 a c d e
 +
 +zsh: 5.0.2
 +2 b 3 c 2 b 1 a
 +2 b 3 2 b 1 a c
 +2 b 1 a
 +2 b 1 a
 +1 2 3 c b a
 +1 2 b a
 +1 2 b 3 c 2 b 4 5 e
 +1 2 b 3 2 b 4 5
 </​code>​ </​code>​
  
 ===== See also ===== ===== See also =====
 +
   * [[syntax:​pe|Parameter expansion]] (contains sections for arrays)   * [[syntax:​pe|Parameter expansion]] (contains sections for arrays)
   * [[syntax:​ccmd:​classic_for]] (contains some examples to iterate over arrays)   * [[syntax:​ccmd:​classic_for]] (contains some examples to iterate over arrays)
Line 236: Line 439:
   * [[http://​mywiki.wooledge.org/​BashFAQ/​005|BashFAQ 005 - How can I use array variables?​]] - A very detailed discussion on arrays with many examples.   * [[http://​mywiki.wooledge.org/​BashFAQ/​005|BashFAQ 005 - How can I use array variables?​]] - A very detailed discussion on arrays with many examples.
   * [[http://​mywiki.wooledge.org/​BashSheet#​Arrays|BashSheet - Arrays]] - Bashsheet quick-reference on Greycat'​s wiki.   * [[http://​mywiki.wooledge.org/​BashSheet#​Arrays|BashSheet - Arrays]] - Bashsheet quick-reference on Greycat'​s wiki.
 +
 +<div hide> vim: set fenc=utf-8 ff=unix ts=4 sts=4 sw=4 ft=dokuwiki et wrap lbr: </​div>​