Parsing pipe delimited input in awk

Developer https://www.devze.com 2023-03-24 13:01 Source: Internet
I have seen many posts asking similar questions, but I can't get it working.

Input looks like:

<field one with spaces>|<field two with spaces>

Trying to parse with awk.

Have tried many variants from excellent posts:

FS = "^[\x00- ]*|[\x00- ]*[|][\x00- ]*|[\x00- ]*$";
FS = "^[\x00- ]*|[\x00- ]*\|[\x00- ]*|[\x00- ]*$";
FS = "^[\x00- ]*|[\x00- ]*\\|[\x00- ]*|[\x00- ]*$";

Still can't get the pipe delimiter to work.

Using CentOS.

Any help?


 echo "field one has spaces | field two has spaces" \
 | awk '
   BEGIN {
     FS="|"
   }
   {
     print $2
     print $1
     # or whatever you want
   }'

 #output

  field two has spaces
  field one has spaces

You can also reduce this to

 awk -F'|' '{
     print $2
     print $1
 }'

Edit: Also, not all awks accept a multi-character regex as the FS value.
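If the goal is also to drop the blanks around each field, one way that avoids a regex FS is to split on the single | character and trim each field afterwards. The following is only a sketch: it assumes an awk that supports user-defined functions (gawk on CentOS does), and the names trim and trim_fields are made up for this illustration.

```shell
# Split on a literal pipe, then strip leading/trailing blanks from each field.
# trim() and trim_fields() are hypothetical names used only for this example.
trim_fields() {
  awk -F'|' '
    function trim(s) {
      gsub(/^[ \t]+|[ \t]+$/, "", s)   # remove spaces and tabs at both ends
      return s
    }
    {
      print trim($2)
      print trim($1)
    }'
}

echo "field one has spaces | field two has spaces" | trim_fields
```

Because FS is a single character here, this should behave the same in awks that do not accept a regex for FS.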

Edit2: Somehow I missed this originally, but I see you are trying to include \x00 in the character classes before and after the | character. I assume \x00 is meant to be the null character? I don't think you're going to be able to have awk parse a file with embedded null characters. You could preprocess your input like this:

 tr '\000' ' ' < file.txt > spacesForNulls.txt

OR delete them altogether with

 tr -d '\000' < file.txt > deletedNulls.txt

and eliminate that part of your regex. But as above, some awks don't support a regex for the FS value. Also, I don't use the tr trick very often; you may find that it requires a slightly different notation for the null character, depending on your version of tr (the octal form '\000' is the one POSIX specifies).
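Putting the two steps together, a minimal sketch of the whole pipeline could look like the following. It assumes the null characters carry no meaning and can simply be deleted; the function name second_field is invented for this example.

```shell
# Delete NUL bytes first, then split on the pipe with a single-character FS.
# second_field() is a hypothetical name used only for this example.
second_field() {
  tr -d '\000' | awk -F'|' '{ print $2 }'
}

# printf emits a NUL byte for the octal escape \000 in its format string.
printf 'field one\000|field two\000\n' | second_field
```

With a real file, the same function can read from it directly: second_field < file.txt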

I hope this helps.
