I have this file "file.txt" which I want to split into many smaller ones. This is a piece of it:
0 id:2293 7:0.78235 12:0.69205 17:0.79421 21:0.77818 ..
4 id:2293 7:0.78235 8:0.97904 12:0.69205 17:0.31709 ..
1 id:2294 7:0.78235 8:0.90994 17:0.49058 21:0.59326 ..
Each line of the file has an id field which looks like "id:1" for a line belonging to id 1.
For each id in the file, I would like to create a file named id<id>.txt and put all lines that belong to this id in that file.
My brute force bash script solution reads as follows.
count=1
while [ $count -lt 19945 ]
do
cat file.txt | grep "id:$count " >> ./sets/id$count.txt
count=$((count + 1))
done
Now this is very inefficient, as I have to read through the file about 20,000 times. Is there a way to do the same operation with only one pass through the file? What I'm probably asking for is a way to use the value matched by a regular expression to name the associated output file.
$ cat file
0 id:2293 7:0.78235 12:0.69205 17:0.79421 21:0.77818 ..
4 id:2293 7:0.78235 8:0.97904 12:0.69205 17:0.31709 ..
1 id:2294 7:0.78235 8:0.90994 17:0.49058 21:0.59326 ..
$ awk -F"[: ]" '{print $0 > ("id_" $3 ".txt")}' file
$ more id_2293.txt
0 id:2293 7:0.78235 12:0.69205 17:0.79421 21:0.77818 ..
4 id:2293 7:0.78235 8:0.97904 12:0.69205 17:0.31709 ..
$ more id_2294.txt
1 id:2294 7:0.78235 8:0.90994 17:0.49058 21:0.59326 ..
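With roughly 20,000 distinct ids, an awk that keeps every output file open may run into the limit on simultaneously open files. The following is only a sketch of one way around that, not a tested solution for the real data: it closes the previous output file whenever the id changes and appends with >>, so a file that gets reopened later keeps its earlier lines. The temporary directory and the three-line sample are assumptions made for illustration, and the sets/id<id>.txt naming follows the script in the question. Because it appends, the output directory should be empty before the run.

```shell
#!/bin/sh
# Sketch: single pass, one output file per id, at most one output file open at a time.
# Assumption: a scratch directory and a three-line sample stand in for the real file.txt.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > file.txt <<'EOF'
0 id:2293 7:0.78235 12:0.69205
4 id:2293 7:0.78235 8:0.97904
1 id:2294 7:0.78235 8:0.90994
EOF

mkdir -p sets

awk -F'[: ]' '
{
    out = "sets/id" $3 ".txt"        # $3 is the number that follows "id:"
    if (out != prev) {               # id changed: release the previous file
        if (prev != "") close(prev)
        prev = out
    }
    print >> out                     # append, so a reopened file keeps earlier lines
}' file.txt
```

If the lines for each id are grouped together, as in the sample, every output file is opened exactly once; if they are scattered, the script still works but reopens files more often.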
You can build a solution similar to the one in "Creating multiple csv files from data within a csv file".
Try this AWK script:
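#!/usr/bin/awk -f
# Needs GNU awk (gawk): the three-argument match() that fills array a is a gawk extension.
{
    # a[1] holds the digits captured after "id:"
    if (match($0, /id:([0-9]+)/, a))
        print $0 >> ("file" a[1] ".txt")
}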
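The three-argument form of match() is specific to GNU awk. As a hedged alternative for other awk implementations, the id can be cut out of the matched text with the standard RSTART and RLENGTH variables. This is a minimal sketch; the temporary directory and the sample input are assumptions for illustration, and the output names follow the "file<id>.txt" pattern used in the script above.

```shell
#!/bin/sh
# Sketch: same idea without gawk extensions, using the two-argument match().
# Assumption: a scratch directory and a three-line sample stand in for the real input.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

cat > file.txt <<'EOF'
0 id:2293 7:0.78235 12:0.69205
4 id:2293 7:0.78235 8:0.97904
1 id:2294 7:0.78235 8:0.90994
EOF

awk '
match($0, /id:[0-9]+/) {
    # RSTART and RLENGTH describe the matched "id:NNNN"; skip the 3-character "id:" prefix
    id = substr($0, RSTART + 3, RLENGTH - 3)
    print >> ("file" id ".txt")
}' file.txt
```

Lines without an id field are skipped, as in the gawk version.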