开发者

Search for lines in a file that contain the lines of a second file

开发者 https://www.devze.com 2023-03-11 17:23 Source: web
I have a first file with an ID on each line, for example:

458-12-345
466-44-3-223
578-4-58-1
599-478
854-52658
955-12-32

Then I have a second file. It has an ID on each line, followed by information, for example:

111-2457-1 0.2545 0.5484 0.6914 0.4222
112-4844-487 0.7475 0.4749 0.1114 0.8413
115-44-48-5 0.4464 0.8894 0.1140 0.1044

....

The first file only has 1000 lines, with the IDs of the info I need, while the second file has more than 200,000 lines.

I used the following bash command on Fedora with good results:

cat file1.txt | while read line; do cat file2.txt | egrep "^$line\ "; done > file3.txt

However, I'm now trying to replicate the results on Ubuntu, and the output is a blank file. Is there a reason this would not work on Ubuntu?

Thanks!


You can grep for several strings at once:

grep -f id_file data_file

This assumes that id_file contains all the IDs and data_file contains the IDs and data.
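
One caveat: grep -f treats each ID as a pattern that can match anywhere on the line, so an ID such as 599-478 would also select a line whose ID is 1599-478-2. Below is a minimal sketch of one way to keep the anchoring of the original loop (start of line, followed by a space). The sample rows, the scratch directory and the intermediate file name `patterns` are made up for illustration, and the approach assumes the IDs contain only digits and hyphens, so they need no regex escaping.

```shell
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical sample data modelled on the question.
printf '%s\n' 458-12-345 599-478 > id_file
printf '%s\n' \
  '458-12-345 0.2545 0.5484 0.6914 0.4222' \
  '1599-478-2 0.7475 0.4749 0.1114 0.8413' \
  '599-478 0.4464 0.8894 0.1140 0.1044' > data_file

# Turn each ID into the pattern "^ID " (anchored at line start,
# followed by a space) and store the list in a pattern file.
sed 's/^/^/; s/$/ /' id_file > patterns

# A single pass over data_file, matching all anchored patterns at once.
grep -f patterns data_file > file3.txt

cat file3.txt
```

With this sample data, the row for 1599-478-2 is skipped, whereas a plain grep -f id_file data_file would have included it.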


Typical job for awk:

awk 'FNR==NR{i[$1]=1;next} i[$1]{print}' file1 file2

This will print the lines from the second file whose first field (the ID) appears in the first file. For even more speed, use mawk.
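
To make the two-file logic easier to follow, here is a commented sketch of the same idea run on a few sample rows. The file contents and the array name `wanted` are illustrative only, and `$1 in wanted` is a small variation on `i[$1]` that tests membership without creating new array entries.

```shell
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical sample data modelled on the question.
printf '%s\n' 458-12-345 599-478 > file1
printf '%s\n' \
  '111-2457-1 0.2545 0.5484 0.6914 0.4222' \
  '458-12-345 0.7475 0.4749 0.1114 0.8413' \
  '599-478 0.4464 0.8894 0.1140 0.1044' > file2

awk '
  # FNR == NR is true only while the first file is being read:
  # remember each ID, then skip to the next input line.
  FNR == NR { wanted[$1] = 1; next }
  # Second file: print the line if its first field is a remembered ID.
  $1 in wanted
' file1 file2 > file3.txt

cat file3.txt
```

Each file is read only once, and because the comparison is on the whole first field, an ID that is merely a substring of another ID does not match.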


This line works fine for me on Ubuntu:

cat 1.txt | while read line; do cat 2.txt | grep "$line"; done

However, this may be slow, because the second file (200,000 lines) will be grepped 1000 times, once for each line of the first file.
