rule based file parsing

Developer https://www.devze.com 2023-01-02 05:14 Source: Internet


Related topics: ruby

I need to parse a file line by line according to given rules.

Here is the requirement.

The file can have multiple lines with different data:

01200344545143554145556524341232131
1120034454514355414555652434123213101200344545143554145556524341232131
2120034454514

and rules can be like this.

I am looking for any language that can do this quickly on very large files, for example larger than 2 GB.

Appreciate all the help in advance.

Thanks

It doesn't appear in your list of tags, but I'd use sed:

sed -n -e '/^0/w /tmp/record0.dat' \
       -e '/^1/w /tmp/record1.dat' \
       -e '/^2/w /tmp/record2.dat' "$@"

You can also do it in other languages, but in this case sed is hard to beat for conciseness and probable correctness.
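To see what the sed command does, here is a minimal sketch that runs it on the three sample lines from the question. The temporary directory and the name input.dat are assumptions made for the demo (the original writes straight to /tmp), and the rules are assumed to mean "one output file per leading record type".

```shell
#!/bin/sh
# Sketch: split the sample records from the question by their first character.
# The temporary directory and input.dat are illustrative names, not from the original post.
out=$(mktemp -d)

cat > "$out/input.dat" <<'EOF'
01200344545143554145556524341232131
1120034454514355414555652434123213101200344545143554145556524341232131
2120034454514
EOF

# -n suppresses the default output; each w command writes the lines
# matching its address to its own file.
sed -n -e "/^0/w $out/record0.dat" \
       -e "/^1/w $out/record1.dat" \
       -e "/^2/w $out/record2.dat" "$out/input.dat"

# Show how many lines landed in each output file.
wc -l "$out"/record?.dat
```

sed reads the input once, line by line, so memory use should not grow with the file size. That matters for inputs over 2 GB.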

This will work regardless of the value of the first character, so it scales without having to add more rules:

awk '{c=substr($0,1,1); print $0 > "/tmp/record" c ".dat"}' inputfile.dat

awk -vFS= 'NF{print $0>"/tmp/record"$1".dat"}' file
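The awk approach can be tried the same way. This is only a sketch: the temporary directory, input.dat, and the extra lines (an empty line and one starting with 9) are assumptions added to show that a new leading character gets its own file without a new rule, and that empty lines are skipped.

```shell
#!/bin/sh
# Sketch: route each line to a file named after its first character.
# The directory, input.dat, and the extra test lines are illustrative assumptions.
out=$(mktemp -d)

printf '%s\n' \
  '01200344545143554145556524341232131' \
  '2120034454514' \
  '' \
  '9999' \
  '0555' > "$out/input.dat"

# length($0) skips empty lines; substr is 1-indexed, so the first
# character is substr($0, 1, 1). dir is passed in with -v.
awk -v dir="$out" 'length($0) { c = substr($0, 1, 1); print > (dir "/record" c ".dat") }' "$out/input.dat"

# List the files that were created.
ls "$out"
```

awk keeps every output file open until it exits, so this suits a small set of record types. With many distinct leading characters you may run into the open-file limit on some systems.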
