
grep -f alternative sed?awk?

Developer https://www.devze.com 2023-03-01 01:01 Source: Internet
file1 = 95000
file2 = 4500000

I want to filter out file1 entries from file2.

egrep -f file1 file2

takes ages to complete. Is there a faster alternative, such as sed or awk?

Thanks


Sure, you can use awk. Load the file2 entries into an array, then iterate over file1 and look each line up in the array.

awk 'FNR==NR{a[$0];next}($0 in a)' file2 file1

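Play around with these variants to get what you want. The first file named is the one loaded into the array, and the ! negates the lookup, so the second variant prints the file2 lines that are not in file1. Unlike egrep -f, these compare whole lines exactly rather than treating each file1 entry as a pattern.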

awk 'FNR==NR{a[$0];next}(!($0 in a))' file2 file1
awk 'FNR==NR{a[$0];next}(!($0 in a))' file1 file2
awk 'FNR==NR{a[$0];next}($0 in a)' file1 file2
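
To see which variant matches the question, here is a small sketch on made-up data. The temporary directory, the sample lines, and the array name seen are assumptions for illustration only; the awk idiom is the same as above.

```shell
# Hypothetical sample data standing in for the real files.
workdir=$(mktemp -d)
printf 'b\nd\n' > "$workdir/file1"
printf 'a\nb\nc\nd\ne\n' > "$workdir/file2"

# While reading the first file (FNR==NR), remember every file1 line.
# While reading the second file, print only lines not seen in file1.
awk 'FNR==NR { seen[$0]; next } !($0 in seen)' \
    "$workdir/file1" "$workdir/file2" > "$workdir/kept"

cat "$workdir/kept"
```

With this sample, kept contains a, c and e. One caveat: if file1 is empty, FNR==NR stays true while file2 is read, so nothing is printed.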


I don't think grep -f is really meant to work with a filter file of that size, so some sort of database-backed solution might be your best bet.

You could load both files line by line into an SQLite database and then run a simple bit of SQL, something like this:

SELECT line FROM file2
EXCEPT
SELECT line FROM file1;

and dump them back out. You could do all of that straight from the command line with SQLite.
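
A rough sketch of how that could look from the command line, assuming the sqlite3 command-line shell is installed. The sample data, the database name filter.db, and the output file result.txt are made up for illustration; the table and column names follow the SQL above.

```shell
# Hypothetical sample data standing in for the real files.
dbdir=$(mktemp -d)
printf 'b\nd\n' > "$dbdir/file1"
printf 'a\nb\nc\nd\ne\n' > "$dbdir/file2"
cd "$dbdir" || exit 1

# One single-column table per file; .import loads one row per input line.
sqlite3 filter.db > result.txt <<'EOF'
CREATE TABLE file1(line TEXT);
CREATE TABLE file2(line TEXT);
.import file1 file1
.import file2 file2
SELECT line FROM file2
EXCEPT
SELECT line FROM file1
ORDER BY line;
EOF

cat result.txt
```

Two things to watch for: EXCEPT also collapses duplicate lines in file2, and with the shell's default import settings, lines containing | or double quotes may not load cleanly into a single column.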
