
How to search duplicate users in two files and then print those lines? [duplicate]

This question already has answers here: How to search duplicate users in two files and then print those lines? (2 answers) Closed 8 years ago.

I have two files: FILE1 and FILE2

FILE1:

user1        1.1.1.1
user2        2.2.2.2
user3        3.14.14.3
user4        4.4.4.4
user5        198.222.222.222

FILE2:

user1        99.22.54.214
user66       45.22.88.88
user99       44.55.66.66
user4        8.8.8.8
user39       54.54.54.54
user2        2.2.2.2

OUTPUT FILE:

user1        1.1.1.1
user1        99.22.54.214
user2        2.2.2.2
user4        4.4.4.4
user4        8.8.8.8

I tried with a for loop but without much success. Can anyone write me some code for this? Thanks!
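For reference, a minimal sketch of one way to produce exactly that output (assuming the usernames are unique within each file; the multi-space column alignment is not preserved):

join <(sort FILE1) <(sort FILE2) | awk '{ print $1, $2; print $1, $3 }' | sort -u

join pairs up the lines whose first field appears in both files, the awk step splits each joined record back into two user/IP lines, and sort -u collapses the user2 line that is identical in both files.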


while read user ip ; do match=`grep -E "$user " file2 2>/dev/null` ; if [ $? -eq 0 ] ; then echo $user $ip ; echo $match ; fi ; done < file1
user1 1.1.1.1
user1 99.22.54.214
user2 2.2.2.2
user2 2.2.2.2
user4 4.4.4.4
user4 8.8.8.8
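A slightly more defensive variant of the same loop (a sketch, using the same file1/file2 names): quoting the expansions and anchoring the pattern to the start of the line makes sure only the username column is matched.

while read -r user ip; do
    # grep's exit status drives the if; $match collects the matching line(s) from file2
    if match=$(grep -E "^$user[[:space:]]" file2); then
        printf '%s %s\n%s\n' "$user" "$ip" "$match"
    fi
done < file1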


fgrep -h -f <(cut -d ' ' -f 1 FILE1 FILE2 | sort | uniq -d) FILE1 FILE2 | sort -k1

That cuts the first field out of both files, finds the usernames that occur more than once, and then pulls the matching lines out of both files. But you can do this with awk in several ways too, e.g. something like:

awk '{ if (!($1 in users)) { users[$1] = $0; printed[$1] = 0 } else { if (printed[$1] == 0) { print users[$1]; printed[$1] = 1 }; print $0 } }' FILE1 FILE2 | sort

The first time it sees a user, it saves that line; on each later occurrence of the same user it checks whether the first occurrence has already been printed, prints it if not, and then prints the current line. Once the first occurrence has been printed, only the current line is printed.
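Written out as a standalone script with comments, a sketch of the same logic (dupusers.awk is just an illustrative name; the trailing uniq is only needed because the user2 line is identical in both files):

# dupusers.awk -- commented sketch of the one-liner above
{
    if (!($1 in users)) {
        users[$1] = $0          # first sighting: remember the whole line
        printed[$1] = 0
    } else {
        if (printed[$1] == 0) { # first duplicate: emit the saved line too
            print users[$1]
            printed[$1] = 1
        }
        print $0                # emit the current line as well
    }
}

Invoke it as: awk -f dupusers.awk FILE1 FILE2 | sort | uniq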

HTH


$ awk 'FNR==NR{a[$1]=$0;next}($1 in a){print $0;print a[$1]} ' file2 file1 | uniq
user1        1.1.1.1
user1        99.22.54.214
user2        2.2.2.2
user4        4.4.4.4
user4        8.8.8.8
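For readers unfamiliar with the FNR==NR idiom, the same one-liner written out with comments (an annotated sketch of identical logic):

awk '
    FNR == NR {          # true only while reading the first file (file2)
        a[$1] = $0       # remember its line, keyed by username
        next
    }
    ($1 in a) {          # now reading file1: username also present in file2
        print $0         # the line from file1
        print a[$1]      # the matching line from file2
    }
' file2 file1 | uniq

The final uniq collapses the pair of identical user2 lines, just as in the output above.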


Here is my attempt, which preserves the spaces within a line. First, create a script called showdup.awk:

# showdup.awk
$1 != lastkey {
    # Flush out the previous set: only print it if that set contains
    # more than one line
    if (count > 1) {
        for (i = 0; i < count; i++) {
            print savedLine[i]
        }
    }

    # Reset the count
    count = 0
}

{
    # Remember the current line as part of the current set
    savedLine[count++] = $0;
    lastkey = $1;
}

END {
    # Flush the final set as well; without this, duplicates in the
    # last group of the sorted input would never be printed
    if (count > 1) {
        for (i = 0; i < count; i++) {
            print savedLine[i]
        }
    }
}

Next, invoke showdup.awk:

cat file1 file2 | sort | awk -f showdup.awk
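One detail worth noting: user2 carries the identical line 2.2.2.2 in both files, so the sorted stream contains that line twice and showdup.awk prints both copies. Appending uniq reproduces the OUTPUT FILE from the question exactly:

cat file1 file2 | sort | awk -f showdup.awk | uniq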


Take a look at the unix command uniq

http://unixhelp.ed.ac.uk/CGI/man-cgi?uniq

Assuming space characters and not tabs in the files, something like this may work:

cat file1 file2 | sort | uniq -D -w6 | uniq >file3
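The -w6 option limits the comparison to the first six characters, which happens to cover the usernames above but ties the command to that particular width. A sketch that keys on the whole first field instead (combined is just an illustrative temporary file name):

cat file1 file2 > combined
awk 'NR==FNR { seen[$1]++; next } seen[$1] > 1' combined combined | sort | uniq > file3

The first pass over combined counts each username; the second pass prints only the lines whose username occurred more than once.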


