Differences
This shows you the differences between two versions of the page.
syntax:quoting [2013/07/07 10:03] ormaaj [ANSI C like strings] portability |
syntax:quoting [2019/10/30 17:39] (current) ersen add tilde exp. to weak quoting |
||
---|---|---|---|
Line 3: | Line 3: | ||
{{keywords>bash shell scripting quoting quotes escape backslash marks singlequotes doublequotes single double}} | {{keywords>bash shell scripting quoting quotes escape backslash marks singlequotes doublequotes single double}} | ||
- | Quoting and escaping is really an important way to influence the way, Bash treats your input. There are three recognized types: | + | Quoting and escaping are important, as they influence the way Bash acts upon your input. There are three recognized types: |
* **per-character escaping** using a backslash: ''\$stuff'' | * **per-character escaping** using a backslash: ''\$stuff'' | ||
* **weak quoting** with double-quotes: ''"stuff"'' | * **weak quoting** with double-quotes: ''"stuff"'' | ||
* **strong quoting** with single-quotes: ''<nowiki>'stuff'</nowiki>'' | * **strong quoting** with single-quotes: ''<nowiki>'stuff'</nowiki>'' | ||
- | All three forms have the very same purpose: **They give you general control over parsing, expansion and expansion's results.** | + | All three forms have the very same purpose: **They give you general control over parsing, expansion and expansion results.** |
- | Beside these common basic variants, there are some more special quoting methods (like interpreting ANSI-C escapes in a string) you'll meet below. | + | Besides these basic variants, there are some special quoting methods (like interpreting ANSI-C escapes in a string) you'll meet below. |
- | :!: **ATTENTION** :!: These quote characters (''"'', double quote and ''<nowiki>'</nowiki>'', single quote) are a syntax element that influences parsing. It is not related to eventual quote characters that are passed as text to the commandline! The syntax-quotes are removed before the command is called! Look: | + | :!: **ATTENTION** :!: The quote characters (''"'', double quote and ''<nowiki>'</nowiki>'', single quote) are syntax elements that influence parsing. They are not related to quote characters passed as text to the command line! The syntax quotes are removed before the command is called! Example: |
<code> | <code> | ||
### NO NO NO: this passes three strings: | ### NO NO NO: this passes three strings: | ||
Line 21: | Line 21: | ||
somecommand $MYARG | somecommand $MYARG | ||
- | ### THIS IS NOT (!!!!) THE SAME AS ### | + | ### THIS IS NOT (!) THE SAME AS ### |
command "my multiword argument" | command "my multiword argument" | ||
Line 31: | Line 31: | ||
===== Per-character escaping ===== | ===== Per-character escaping ===== | ||
- | Per-character escaping is useful in different places, also here, on expansions and substitutions. In general, a character that has a special meaning for Bash, like the dollar-sign (''$'') to introduce some expansion types, can be masked to not have that special meaning using the backslash: | + | Per-character escaping is useful in expansions and substitutions. In general, a character that has a special meaning to Bash, like the dollar sign (''$''), can be masked with a backslash so that it loses that special meaning: |
<code> | <code> | ||
echo \$HOME is set to \"$HOME\" | echo \$HOME is set to \"$HOME\" | ||
</code> | </code> | ||
- | * ''\$HOME'' won't expand because it's not variable expansion syntax anymore | + | * ''\$HOME'' won't expand because it's not in variable-expansion syntax anymore |
- | * The quotes are masked with the backslash to be literal - otherwise they would be interpreted by Bash | + | * The backslash changes the quotes into literals - otherwise Bash would interpret them |
The sequence ''\<newline>'' (an unquoted backslash, followed by a ''<newline>'' character) is interpreted as **line continuation**. It is removed from the input stream and thus effectively ignored. Use it to beautify your code: | The sequence ''\<newline>'' (an unquoted backslash, followed by a ''<newline>'' character) is interpreted as **line continuation**. It is removed from the input stream and thus effectively ignored. Use it to beautify your code: | ||
Line 51: | Line 51: | ||
</code> | </code> | ||
- | The backslash can be used to mask every character that has a special meaning for bash. __Exception:__ Inside a single-quoted string (see below). | + | The backslash can be used to mask every character that has a special meaning to bash. __Exception:__ Inside a single-quoted string (see below). |
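The rule and its exception can be checked with a minimal sketch like the following. The variable names (''name'', ''escaped'', ''expanded'', ''single'') are made up for this illustration and are not part of the original page.

```shell
# Hypothetical variable used only for this illustration.
name="world"

# An escaped dollar sign is literal; the unescaped one still expands.
escaped=\$name
expanded=$name

# Exception: inside single quotes the backslash is not an escape character,
# so it stays in the text.
single='\$name'

printf '%s\n' "$escaped" "$expanded" "$single"
```

Under these assumptions the three printed lines should be ''$name'', ''world'' and ''\$name''.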
Line 60: | Line 60: | ||
Inside a weak-quoted string there's **no special interpretation of**: | Inside a weak-quoted string there's **no special interpretation of**: | ||
- | * spaces as word-separators (on inital commandline splitting and on [[syntax:expansion:wordsplit | word splitting]]!) | + | * spaces as word-separators (on initial command line splitting and on [[syntax:expansion:wordsplit | word splitting]]!) |
* single-quotes to introduce strong-quoting (see below) | * single-quotes to introduce strong-quoting (see below) | ||
* characters for pattern matching | * characters for pattern matching | ||
+ | * tilde expansion | ||
* pathname expansion | * pathname expansion | ||
* process substitution | * process substitution | ||
Line 76: | Line 77: | ||
echo "Your PATH is: $PATH" | echo "Your PATH is: $PATH" | ||
</code> | </code> | ||
- | Will work as expected. ''$PATH'' is expanded, because it's only double- (weak-) quoted. | + | Will work as expected. ''$PATH'' is expanded, because it's double (weak) quoted. |
- | If a backslash in double-quotes ("weak quoting") occurs, there are 2 ways to deal with it | + | If a backslash in double quotes ("weak quoting") occurs, there are 2 ways to deal with it |
* if the backslash is followed by a character that would have a special meaning even inside double-quotes, the backslash is removed and the following character loses its special meaning | * if the backslash is followed by a character that would have a special meaning even inside double-quotes, the backslash is removed and the following character loses its special meaning | ||
* if the backslash is followed by a character without special meaning, the backslash is not removed | * if the backslash is followed by a character without special meaning, the backslash is not removed | ||
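The two cases above can be illustrated with a short sketch. The variable names ''removed'' and ''kept'' are invented for this example, and ''z'' merely stands in for any character without special meaning inside double quotes.

```shell
# Case 1: backslash before a character that is special inside double quotes ($).
# The backslash is removed and the dollar sign becomes literal.
removed="\$HOME"

# Case 2: backslash before a character without special meaning (z).
# The backslash is kept in the result.
kept="\z"

# printf with %s prints the strings unchanged, without interpreting backslashes.
printf '%s\n' "$removed" "$kept"
```

The first line printed should be the literal text ''$HOME'', the second should still contain the backslash.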
Line 89: | Line 90: | ||
Strong quoting is very easy to explain: | Strong quoting is very easy to explain: | ||
- | Inside a single-quoted string **nothing(!!!!)** is interpreted, except the single-quote that closes the quoting. | + | Inside a single-quoted string **nothing** is interpreted, except the single-quote that closes the string. |
<code> | <code> | ||
echo 'Your PATH is: $PATH' | echo 'Your PATH is: $PATH' | ||
</code> | </code> | ||
- | That ''$PATH'' won't be expanded, it's interpreted as normal ordinary text, because it's surrounded by strong quotes. | + | ''$PATH'' won't be expanded, it's interpreted as ordinary text because it's surrounded by strong quotes. |
- | In practise that means, to produce a text like ''Here's my test...'' as a single-quoted string, you have to leave and re-enter the single-quoting to get the character "''<nowiki>'</nowiki>''" as literal text: | + | In practice, that means that to produce text like ''Here's my test...'' as a single-quoted string, you have to leave and re-enter the single quoting to get the character "''<nowiki>'</nowiki>''" as literal text: |
<code> | <code> | ||
# WRONG | # WRONG | ||
Line 111: | Line 112: | ||
===== ANSI C like strings ===== | ===== ANSI C like strings ===== | ||
- | There's another quoting mechanism, Bash provides: Strings that are scanned for ANSI C like escape sequences. The Syntax is | + | Bash provides another quoting mechanism: strings that contain ANSI C-like escape sequences. The syntax is: |
<code> | <code> | ||
$'string' | $'string' | ||
</code> | </code> | ||
where the following escape sequences are decoded in ''string'': | where the following escape sequences are decoded in ''string'': | ||
- | ^ Code ^ Meaning ^ | + | ^ Code ^ Meaning ^ |
- | | ''\a'' | terminal alert character (bell) | | + | | ''\"'' | double-quote | |
- | | ''\b'' | backspace | | + | | ''\%%'%%'' | single-quote | |
- | | ''\e'' | escape (ASCII 033) | | + | | ''\\'' | backslash | |
- | | ''\E'' | escape (ASCII 033) | | + | | ''\a'' | terminal alert character (bell) | |
- | | ''\f'' | form feed | | + | | ''\b'' | backspace | |
- | | ''\n'' | newline | | + | | ''\e'' | escape (ASCII 033) | |
- | | ''\r'' | carriage return | | + | | ''\E'' | escape (ASCII 033) **\E is non-standard** | |
- | | ''\t'' | horizontal tab | | + | | ''\f'' | form feed | |
- | | ''\v'' | vertical tab | | + | | ''\n'' | newline | |
- | | ''\\'' | backslash | | + | | ''\r'' | carriage return | |
- | | ''\%%'%%'' | single quote | | + | | ''\t'' | horizontal tab | |
- | | ''\nnn'' | the eight-bit character whose value is the octal value nnn (one to three digits) | | + | | ''\v'' | vertical tab | |
- | | ''\xHH'' | the eight-bit character whose value is the hexadecimal value HH (one or two hex digits) | | + | | ''\cx'' | a control-x character, for example, ''%%$'\cZ'%%'' to print the control sequence composed of Ctrl-Z (''^Z'') | |
- | | ''\cx'' | a control-x character, for example ''%%$'\cZ'%%'' to print the control sequence composed by Ctrl-Z (''^Z'') | | + | | ''\uXXXX'' | Interprets ''XXXX'' as a hexadecimal number and prints the corresponding character from the character set (4 digits) (Bash 4.2-alpha) | |
- | | ''\uXXXX'' | Interprets ''XXXX'' as hexadecimal number and prints the corresponding character from the character set (4 digits) (Bash 4.2-alpha) | | + | | ''\UXXXXXXXX'' | Interprets ''XXXXXXXX'' as a hexadecimal number and prints the corresponding character from the character set (8 digits) (Bash 4.2-alpha) | |
- | | ''\uXXXXXXXX'' | Interprets ''XXXX'' as hexadecimal number and prints the corresponding character from the character set (8 digits) (Bash 4.2-alpha) | | + | | ''\nnn'' | the eight-bit character whose value is the octal value nnn (one to three digits) | |
+ | | ''\xHH'' | the eight-bit character whose value is the hexadecimal value HH (one or two hex digits) | | ||
- | This is especially useful when you want to give special characters as arguments to some programs, like giving a newline to sed. | ||
- | The resulting text is treated as if it was **single-quoted**. No further expansions happen. | + | This is especially useful when you want to pass special characters as arguments to some programs, like passing a newline to sed. |
+ | |||
+ | The resulting text is treated as if it were **single-quoted**. No further expansion happens. | ||
The ''<nowiki>$'...'</nowiki>'' syntax comes from ksh93, but is portable to most modern shells including pdksh. A [[http://austingroupbugs.net/view.php?id=249#c590 | specification]] for it was accepted for SUS issue 7. There are still some stragglers, such as most ash variants, including dash (except busybox built with "bash compatibility" features). | The ''<nowiki>$'...'</nowiki>'' syntax comes from ksh93, but is portable to most modern shells including pdksh. A [[http://austingroupbugs.net/view.php?id=249#c590 | specification]] for it was accepted for SUS issue 7. There are still some stragglers, such as most ash variants, including dash (except busybox built with "bash compatibility" features). | ||
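As a rough illustration of the ''<nowiki>$'...'</nowiki>'' behavior described in this section, the sketch below decodes a tab and a newline and shows that no further expansion happens. It assumes Bash (or another shell that supports this syntax); the variable names are made up for the example.

```shell
# $'...' decodes the escape sequences before the command sees the string.
tab_separated=$'one\ttwo'
two_lines=$'first\nsecond'

# The result behaves like single-quoted text: $HOME is not expanded here.
literal=$'$HOME\tstays literal'

printf '%s\n' "$tab_separated" "$two_lines" "$literal"
```

In a shell without this feature, such as dash, the dollar sign and the backslash sequences would be left in the text instead of being decoded.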
Line 146: | Line 149: | ||
echo $"generating database..." | echo $"generating database..." | ||
</code> | </code> | ||
- | means I18N. If there is a translation available for that string, it is used instead of the given text. If not, or if the locale is ''C''/''POSIX'', the dollar sign simply is ignored, which results in a normal double-quoted string. | + | means I18N. If there is a translation available for that string, it is used instead of the given text. If not, or if the locale is ''C''/''POSIX'', the dollar sign is simply ignored, which results in a normal double quoted string. |
- | If the string was replaced (translated), the result is double-quoted. | + | If the string was replaced (translated), the result is double quoted. |
- | In case you're a C-programmer: The purpose of ''$"..."'' is the same as for ''gettext()'' or ''_()''. | + | In case you're a C programmer: The purpose of ''$"..."'' is the same as for ''gettext()'' or ''_()''. |
For useful examples to localize your scripts, please see [[http://tldp.org/LDP/abs/html/localization.html | Appendix I of the Advanced Bash Scripting Guide]]. | For useful examples to localize your scripts, please see [[http://tldp.org/LDP/abs/html/localization.html | Appendix I of the Advanced Bash Scripting Guide]]. | ||
- | **Attention:** There is a security hole. Please read in [[http://www.gnu.org/software/gettext/manual/html_node/bash.html | the gettext documentation]] | + | **Attention:** There is a security hole. Please read [[http://www.gnu.org/software/gettext/manual/html_node/bash.html | the gettext documentation]] |
===== Common mistakes ===== | ===== Common mistakes ===== | ||
Line 161: | Line 164: | ||
==== String lists in for-loops ==== | ==== String lists in for-loops ==== | ||
- | The [[syntax:ccmd:classic_for | classic for-loop]] uses a list of words to iterate through. This list can - of course - also be in a variable: | + | The [[syntax:ccmd:classic_for | classic for loop]] uses a list of words to iterate through. The list can also be in a variable: |
<code> | <code> | ||
mylist="DOG CAT BIRD HORSE" | mylist="DOG CAT BIRD HORSE" | ||
Line 172: | Line 175: | ||
done | done | ||
</code> | </code> | ||
- | Why? Due to the double-quotes, technically, the expansion of ''$mylist'' is seen as **one word**. The for-loop iterates exactly one time, with ''animal'' set to the whole list. | + | Why? Due to the double-quotes, technically, the expansion of ''$mylist'' is seen as **one word**. The for loop iterates exactly one time, with ''animal'' set to the whole list. |
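One way to see the effect is to count the loop iterations for the quoted and the unquoted expansion. This is only a sketch: it assumes the default ''IFS'' (space, tab, newline), and the counter variables are invented for the illustration.

```shell
mylist="DOG CAT BIRD HORSE"

# Quoted expansion: the whole list is a single word, so the body runs once.
quoted_count=0
for animal in "$mylist"; do
  quoted_count=$((quoted_count + 1))
done

# Unquoted expansion: word splitting yields four words, so the body runs four times.
unquoted_count=0
for animal in $mylist; do
  unquoted_count=$((unquoted_count + 1))
done

echo "quoted: $quoted_count, unquoted: $unquoted_count"
```

With the default ''IFS'' this should print ''quoted: 1, unquoted: 4''.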
**__RIGHT__** way to iterate through this list: | **__RIGHT__** way to iterate through this list: | ||
Line 183: | Line 186: | ||
==== Working out the test-command ==== | ==== Working out the test-command ==== | ||
- | The command ''test'' or ''[ ... ]'' ([[commands:classictest | the classic test command]]) is a normal ordinary command, so normal ordinary syntax rules apply. Let's take string comparison as example: | + | The command ''test'' or ''[ ... ]'' ([[commands:classictest | the classic test command]]) is an ordinary command, so ordinary syntax rules apply. Let's take string comparison as an example: |
<code> | <code> | ||
[ WORD = WORD ] | [ WORD = WORD ] | ||
</code> | </code> | ||
- | The '']'' at the end is a convenience; if you type ''which ['' you will see that there is in fact a binary with that name. So if we were writing this as just a test command it would be: | + | The '']'' at the end is a convenience; if you type ''which ['' you will see that there is in fact a binary file with that name. So if we were writing this as a test command it would be: |
<code> | <code> | ||
Line 195: | Line 198: | ||
- | When you compare variables, it's wise to quote them. Let's invent a test string with spaces: | + | When you compare variables, it's wise to quote them. Let's create a test string with spaces: |
<code> | <code> | ||
mystring="my string" | mystring="my string" | ||
Line 203: | Line 206: | ||
<code> | <code> | ||
- | [ $mystring = testword ] # WRONG!!! | + | [ $mystring = testword ] # WRONG! |
</code> | </code> | ||
- | This fails! These are too much arguments for the string comparison test. After all expansions performed you really execute: | + | This fails! These are too many arguments for the string comparison test. After expansion is performed, you really execute: |
<code> | <code> | ||
[ my string = testword ] | [ my string = testword ] | ||
Line 214: | Line 217: | ||
So what you really want to do is: | So what you really want to do is: | ||
<code> | <code> | ||
- | [ "$mystring" = testword ] # RIGHT!!! | + | [ "$mystring" = testword ] # RIGHT! |
</code> | </code> | ||
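The difference between the two forms can be observed through the exit status of ''[''. The following sketch compares against the string's own value instead of ''testword'' so that the quoted test succeeds; the result variables are hypothetical names used only here.

```shell
mystring="my string"

# Unquoted: the expansion is split into two words, so [ receives too many
# arguments, reports an error (silenced here) and returns a non-zero status.
if [ $mystring = "my string" ] 2>/dev/null; then
  unquoted_result="matched"
else
  unquoted_result="failed"
fi

# Quoted: [ receives exactly one word on the left-hand side of the comparison.
if [ "$mystring" = "my string" ]; then
  quoted_result="matched"
else
  quoted_result="failed"
fi

echo "unquoted: $unquoted_result, quoted: $quoted_result"
```

The unquoted test fails even though the value is identical, which is why quoting variables in ''test'' expressions is the safer habit.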