Escaping sed strings correctly

Developer https://www.devze.com 2022-12-16 19:05 Source: web
I have a regex and replacement pattern that have both been tested in Notepad++ on my input data and work correctly. When I put them into a sed expression, however, nothing gets matched.

Here is the sed command:

 # SEARCH = ([a-zA-Z0-9.]+) [0-9] (.*)
 # REPLACE = \2 (\1)

 sed -e 's/\([a-zA-Z0-9.]+\) [0-9] \(.*\)/\2 \(\1\)/g'

Here is a sampling of the data:

jdoe 1 Doe, John
jad 1 Doe, Jane
smith 2 Smith, Jon

and the desired output:

Doe, John  (jdoe)
Doe, Jane  (jad)
Smith, Jon (smith)

I have tried removing and adding escapes to different characters in the sed expression, but either get nothing matched or something along the lines of:

sed: -e expression #1, char 42: invalid reference \2 on `s' command's RHS

How can I get this escaped correctly?


I usually find it easier to use the -r switch, since escaping then works much as it does in most other languages:

sed -r 's/([a-zA-Z0-9.]+) [0-9] (.*)/\2 (\1)/g' file1.txt
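To try this without creating file1.txt, here is a minimal sketch that pipes the question's three sample rows through the same expression. It uses -E rather than -r on the assumption that your sed accepts it (BSD seds do, and recent GNU sed does as well); the variable names sample and ere_result are only for illustration.

```shell
# Sample rows from the question, held in a variable instead of file1.txt.
sample='jdoe 1 Doe, John
jad 1 Doe, Jane
smith 2 Smith, Jon'

# -E selects extended regular expressions, so ( ) and + need no backslashes.
# In the replacement, the parentheses are plain literal characters.
ere_result=$(printf '%s\n' "$sample" | sed -E 's/([a-zA-Z0-9.]+) [0-9] (.*)/\2 (\1)/')
printf '%s\n' "$ere_result"
```

Each line should come out as the name followed by the login in parentheses, for example "Doe, John (jdoe)".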


A few warnings and additions to what everyone else has already said:

  1. The -r option is a GNU extension that enables extended regular expressions. BSD-derived seds use -E instead (recent GNU sed accepts -E as well).
  2. sed and grep use Basic Regular Expressions by default.
  3. awk uses Extended Regular Expressions.
  4. You should become comfortable with the POSIX specifications such as IEEE Std 1003.1 if you want to write portable scripts, makefiles, etc.

I would recommend rewriting the expression as

's/\([a-zA-Z0-9.]\{1,\}\) [0-9] \(.*\)/\2 (\1)/g'

which should do exactly what you want in any POSIX-compliant sed. If you do care about portability, consider setting the POSIXLY_CORRECT environment variable while testing, so that GNU tools stop accepting their own extensions.
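As a quick check, the portable expression can be run against the question's sample rows. This is only a sketch: the variable names are illustrative, and setting POSIXLY_CORRECT for the one command is an assumption about how you might test it (GNU sed honours the variable; other seds should simply ignore it).

```shell
# Sample rows from the question.
sample='jdoe 1 Doe, John
jad 1 Doe, Jane
smith 2 Smith, Jon'

# Basic regular expression: groups are \( \), and "one or more" is the
# interval \{1,\} rather than +. The replacement parentheses stay unescaped.
bre_result=$(printf '%s\n' "$sample" |
  POSIXLY_CORRECT=1 sed 's/\([a-zA-Z0-9.]\{1,\}\) [0-9] \(.*\)/\2 (\1)/')
printf '%s\n' "$bre_result"
```

The result should match the output of the -r version, since the two expressions describe the same pattern in different syntaxes.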


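Without the -r switch, the plus sign must be escaped (\+) to act as a quantifier; unescaped, it matches a literal "+". Note that \+ is itself a GNU extension, and \{1,\} is the portable spelling.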
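A small sketch of why the original command matched nothing: in a basic regular expression an unescaped + is an ordinary character. The inputs and variable names below are made up for illustration.

```shell
# The pattern a+ in a BRE means "a followed by a literal plus sign",
# so it matches this contrived input and the substitution fires.
literal_plus=$(printf 'a+ 1 x\n' | sed 's/\(a+\) [0-9] \(.*\)/\2 (\1)/')
printf '%s\n' "$literal_plus"

# The question's data contains no "+", so a pattern written the same way
# never matches and the line passes through unchanged.
unchanged=$(printf 'jdoe 1 Doe, John\n' | sed 's/\([a-z]+\) [0-9] \(.*\)/\2 (\1)/')
printf '%s\n' "$unchanged"
```

The first command rewrites its input to "x (a+)"; the second prints "jdoe 1 Doe, John" exactly as it was given.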


Using awk is much simpler, provided every name is exactly two words:

awk '{ print $3 " " $4 " (" $1 ")" }' test.txt

Output:

Doe, John (jdoe)
Doe, Jane (jad)
Smith, Jon (smith)

See man 1 awk.


$ sed -e 's/\([a-zA-Z0-9.].*\) [0-9] \(.*\)/\2 \(\1\)/g' file
Doe, John (jdoe)
Doe, Jane (jad)
Smith, Jon (smith)