An explanation of quoting in bash

Command-line characters are either literal or meta

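You probably already know that whatever you type after the shell prompt,
up to the moment you press Enter, is the command line,
and that only then does the shell run the command you gave it as a process.
But did you also know that every character you type on the command line
falls into a category as far as the shell is concerned?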

Put simply (I would not claim this is a precise definition; see note 1),
each character on the command line is one of two kinds:

literal: ordinary plain text that has no special function to the shell.

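meta: a reserved character that has a specific function to the shell, also called a metacharacter. (Elsewhere we have met metadata, a similar concept: data that describes data, that is, a special kind of data.)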

Two common metas: IFS and CR

Two metas show up on almost every command line:

IFS: consists of <space>, <tab>, or <enter> (we usually use space).

CR: produced by <enter>.

IFS splits the command line into words, because the shell processes a command line word by word.

CR ends the command line, which is why the command runs when we press <enter>.
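
The word splitting done by IFS can be sketched as follows. This is only an illustration; the helper name `count_args` is hypothetical and not part of the original text.

```shell
# count_args is a hypothetical helper: it prints how many words it received.
count_args() { echo "$#"; }

# Unquoted: the spaces act as IFS, so the shell sees three separate words.
unquoted=$(count_args a   b   c)

# Quoted: the spaces are literal text, so the shell sees a single word.
quoted=$(count_args "a   b   c")

echo "unquoted: $unquoted word(s), quoted: $quoted word(s)"
```

However many spaces separate the unquoted words, they only mark word boundaries; inside quotes they become part of the word itself.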

Besides IFS and CR, other commonly used metas include:

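= : assigns a variable.

$ : performs variable or arithmetic substitution (do not confuse it with the shell prompt).

> : redirects stdout.

< : redirects stdin.

| : command pipeline.

& : redirects a file descriptor, or runs the command in the background.

( ) : runs the enclosed commands in a nested subshell, or is used for arithmetic or command substitution.

{ } : runs the enclosed commands as a group in the current shell (much like an unnamed function), or delimits the name in a variable substitution.

; : when the previous command finishes, runs the next command regardless of its return value.

&& : when the previous command finishes, runs the next command if its return value is true.

|| : when the previous command finishes, runs the next command if its return value is false.

! : runs a command from the history list.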
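
A few of these metas can be tried out in a short script. The following is a minimal sketch; the variable names (`r1`, `x`, and so on) are chosen here purely for illustration.

```shell
# ';' runs the next command regardless of the previous exit status.
r1=$(false ; echo "after ;")

# '&&' runs the next command only if the previous one succeeded.
r2=$(true && echo "after &&")

# '||' runs the next command only if the previous one failed.
r3=$(false || echo "after ||")

# ( ) runs in a subshell, so the assignment does not leak out.
x=outer
( x=inner )
after_subshell=$x

# { } groups commands in the current shell, so the assignment persists.
{ x=inner; }
after_group=$x

echo "$r1 / $r2 / $r3 / $after_subshell / $after_group"
```

The last two values show the practical difference between ( ) and { }: only the group in braces changes a variable of the current shell.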

Three commonly used quoting methods

Single quotes

hard quote: ' ' (single quotes).
Every meta inside a hard quote is turned off.
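
As a small sketch of the hard quote (the variable names are illustrative only):

```shell
name=world

# Inside single quotes $, *, ;, | and \ are all plain text.
hard='$name * ; | \n'

printf '%s\n' "$hard"    # prints: $name * ; | \n
```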

Double quotes

soft quote: " " (double quotes).
Most metas inside a soft quote are turned off, but some stay active (such as $). (See note 2.)
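
The same kind of text inside a soft quote behaves differently. A minimal sketch, again with illustrative variable names:

```shell
name=world

# $ stays active inside double quotes, so $name is expanded.
# The characters * ; | lose their special meaning and are kept as text.
soft="hello $name * ; |"

printf '%s\n' "$soft"    # prints: hello world * ; |
```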

Backslash (escape character)

escape: \ (backslash). Only the single meta immediately following the escape character is turned off.
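
A short sketch of the "single meta only" rule; the variable names are illustrative:

```shell
name=world

# The backslash turns off only the one meta that follows it.
# The first $ is escaped (literal); the second is still active.
escaped=$(echo \$name is $name)

# Escaping a space turns what would be two words into one.
set -- one\ word two
argc=$#

echo "$escaped"      # prints: $name is world
echo "argc=$argc"    # prints: argc=2
```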
Single-quoted   'hello'    Verbatim contents
Double-quoted   "hello"    Interpolation enabled
ANSI-C quoted   $'hello'   Some backslash sequences are interpreted
Translatable    $"hello"   Looks up translation based on locale settings

QUOTING means the use of quotation marks

Quoting is used to remove the special meaning of certain characters or words to the shell.  Quoting can be used to disable special treatment for special characters, to prevent reserved words  from
being recognized as such, and to prevent parameter expansion.

Each of the metacharacters listed above under DEFINITIONS has special meaning to the shell and must be quoted if it is to represent itself.

When the command history expansion facilities are being used (see HISTORY EXPANSION below), the history expansion character, usually !, must be quoted to prevent history expansion.

There are three quoting mechanisms: the escape character, single quotes, and double quotes.

A non-quoted backslash (\) is the escape character. It preserves the literal value of the next character that follows, with the exception of <newline>. If a \<newline> pair appears, and the
backslash is not itself quoted, the \<newline> is treated as a line continuation (that is, it is removed from the input stream and effectively ignored).

Enclosing characters in single quotes preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.
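
Since a single quote cannot appear inside single quotes, the usual workarounds look like this (a minimal sketch with illustrative variable names):

```shell
# Close the quote, add a backslash-escaped ', then reopen the quote.
s1='it'\''s'

# Alternative: put the single quote inside double quotes.
s2="it's"

printf '%s\n' "$s1" "$s2"
```

Both variables end up holding the same four characters.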

Enclosing characters in double quotes preserves the literal value of all characters within the quotes, with the exception of $, `, \, and, when history expansion is enabled, !. When the shell is
in posix mode, the ! has no special meaning within double quotes, even when history expansion is enabled. The characters $ and ` retain their special meaning within double quotes. The backslash
retains its special meaning only when followed by one of the following characters: $, `, ", \, or <newline>. A double quote may be quoted within double quotes by preceding it with a backslash.
If enabled, history expansion will be performed unless an ! appearing in double quotes is escaped using a backslash. The backslash preceding the ! is not removed.
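
The backslash rules inside double quotes can be checked with a small sketch like the one below. The variable names are illustrative, and history expansion is left out because it is normally off in scripts.

```shell
# Backslash keeps its special meaning before $ ` " \ and newline.
d1="cost: \$5"        # \$ becomes a literal $
d2="say \"hi\""       # \" becomes a literal "
d3="back\\slash"      # \\ becomes a single \

# Before any other character the backslash is kept as-is.
d4="a\tb"             # stays as a, \, t, b (four characters)

printf '%s\n' "$d1" "$d2" "$d3" "$d4"
```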

The special parameters * and @ have special meaning when in double quotes (see PARAMETERS below).

Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if
present, are decoded as follows:
\a          alert (bell)
\b          backspace
\e
\E          an escape character
\f          form feed
\n          new line
\r          carriage return
\t          horizontal tab
\v          vertical tab
\\          backslash
\'          single quote
\"          double quote
\?          question mark
\nnn        the eight-bit character whose value is the octal value nnn (one to three octal digits)
\xHH        the eight-bit character whose value is the hexadecimal value HH (one or two hex digits)
\uHHHH      the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHH (one to four hex digits)
\UHHHHHHHH  the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHHHHHH (one to eight hex digits)
\cx         a control-x character

The expanded result is single-quoted, as if the dollar sign had not been present.

A double-quoted string preceded by a dollar sign ($"string") will cause the string to be translated according to the current locale. If the current locale is C or POSIX, the dollar sign is ignored.
If the string is translated and replaced, the replacement is double-quoted.

An explanation of $' and $"

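# v is assumed to be unset or empty, so the default after :- is used.
# extquote off: $"..." and $'...' are not treated as quoting
# inside a double-quoted ${parameter} expansion.
shopt -u extquote

echo "${v:-$"translate me"}"
echo "${v:-$'ab\ncd'}"

# extquote on (the default): both forms are processed there.
shopt -s extquote

echo "${v:-$"translate me"}"
echo "${v:-$'ab\ncd'}"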
Here $' (a dollar sign followed by a single quote) is used, so the \t inside is a single tab character. Without the leading $, \t would be two characters.
$ echo $'foo\tbar' >out.tsv
$ od -c out.tsv
0000000 f o o \t b a r \n
0000010

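A character sequence of the form $'string' is treated as a special kind of single quote. The sequence expands to string,
with the backslash-escaped characters in string replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as listed in the man page excerpt above.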
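
The difference between a plain single quote and $'...' can also be seen by comparing string lengths. This is a minimal sketch that assumes it is run by bash; the variable names are illustrative.

```shell
#!/usr/bin/env bash
plain='foo\tbar'     # \t stays as two characters: backslash and t
ansi=$'foo\tbar'     # \t is decoded into one tab character

echo "plain length: ${#plain}"    # prints: plain length: 8
echo "ansi length: ${#ansi}"      # prints: ansi length: 7
```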
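As for $" (a dollar sign followed by a double quote), the man page says it can be used to translate strings according to the current language, which is more involved.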
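
The simplest case can still be shown without any translation setup. The sketch below assumes bash and assumes that no message catalog (TEXTDOMAIN) is configured, so nothing is actually translated.

```shell
#!/usr/bin/env bash
# With the C locale there is no translation to look up,
# so the leading $ is ignored and the string is used unchanged.
LC_ALL=C
msg=$"translate me"

echo "$msg"    # prints: translate me
```

A real translation additionally needs a compiled message catalog, which is the part the linked article covers.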

https://www.kurokatta.org/grumble/2023/11/bash-translated-strings