How to change what sed thinks is the line delimiter

Developer (https://www.devze.com), 2023-03-25 08:55. Source: web

As I'm new to sed, I'm having the fun of discovering that sed doesn't treat the \r character as a valid line delimiter.

Does anyone know how to tell sed which character(s) I'd like it to use as the line delimiter when processing many lines of text?


You can specify it with awk's RS (record separator) variable: awk 'BEGIN {RS = "\r"} ...
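
To make the awk suggestion concrete, here is a minimal sketch. The function name cr_records_awk and the sample input are assumptions for illustration, not part of the original answer. Setting ORS as well keeps \r as the delimiter on output.

```shell
# Hypothetical helper: edit every \r-terminated record with awk.
# RS marks where input records end; ORS is written after each printed record.
cr_records_awk() {
    awk 'BEGIN { RS = "\r"; ORS = "\r" } { sub(/^./, "#"); print }'
}

# Sample input with two \r-terminated records; od -c makes the \r visible.
printf 'abc\rdef\r' | cr_records_awk | od -c
```

A single-character RS should behave the same in the common awk implementations; multi-character or regex separators are less portable.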

Or you can convert with: tr '\r' '\n'


(To make the examples below clearer and less ambiguous, I'll use the od utility extensively.)

sed has no flag for choosing an arbitrary line delimiter. I bet the best solution is the one cited by the previous answers: using tr. Suppose you have a file such as the one below:

$ od -xc slashr.txt
0000000      6261    0d63    6564    0d66                                
           a   b   c  \r   d   e   f  \r                                
0000010
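
The answer does not say how slashr.txt was created. One way to reproduce a file with the same eight bytes, assuming a POSIX printf, might be:

```shell
# Write two records, each terminated by \r, with no \n anywhere in the file.
printf 'abc\rdef\r' > slashr.txt

# Dump the bytes as characters to confirm the \r terminators.
od -c slashr.txt
```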

tr can be used in several ways; the one we want takes two arguments, two different characters, and replaces every occurrence of the first with the second. Feeding the file content to tr '\r' '\n' gives the following result:

$ tr '\r' '\n' < slashr.txt | od -xc 
0000000      6261    0a63    6564    0a66                                
           a   b   c  \n   d   e   f  \n                                
0000010

Great! Now we can use sed:

$ tr '\r' '\n' < slashr.txt | sed 's/^./#/'
#bc
#ef
$ tr '\r' '\n' < slashr.txt | sed 's/^./#/' | od -xc
0000000      6223    0a63    6523    0a66                                
           #   b   c  \n   #   e   f  \n                                
0000010

But I presume you need to keep \r as the line delimiter in the output, right? In that case, just use tr '\n' '\r' to reverse the conversion:

$ tr '\r' '\n' < slashr.txt | sed 's/^./#/' | tr '\n' '\r' | od -xc
0000000      6223    0d63    6523    0d66                                
           #   b   c  \r   #   e   f  \r                                
0000010
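
The whole round trip can be wrapped in a small function so that any sed script can be applied to \r-delimited input. This is only a sketch: the name sed_cr is hypothetical, and it assumes the input contains no \n characters of its own.

```shell
# Hypothetical wrapper: run the sed script given as $1 on \r-delimited stdin.
# Caveat: any \n already present in the input also comes back as \r.
sed_cr() {
    tr '\r' '\n' | sed "$1" | tr '\n' '\r'
}

# Same substitution as above, applied to the same sample bytes.
printf 'abc\rdef\r' | sed_cr 's/^./#/' | od -c
```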


As far as I know, you can't. What's wrong with using a newline as the delimiter? If your input has DOS-style \r\n line endings, it can be preprocessed to strip the \r characters and, if necessary, they can be restored afterwards.
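
For the DOS-style case, stripping and restoring the \r could look like the sketch below. The name sed_crlf is hypothetical. tr -d '\r' removes every \r, including any that are not part of a line ending, so this assumes \r appears only immediately before \n.

```shell
# Hypothetical wrapper: run the sed script given as $1 on \r\n-terminated stdin.
# tr strips all \r bytes; awk appends \r\n to every line on the way out.
sed_crlf() {
    tr -d '\r' | sed "$1" | awk '{ printf "%s\r\n", $0 }'
}

# Two DOS-style lines; od -c shows that the \r \n pairs survive the edit.
printf 'abc\r\ndef\r\n' | sed_crlf 's/^./#/' | od -c
```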
