Use Unix JOIN command to merge two files

Developer https://www.devze.com 2023-03-13 23:45 Source: Internet
This isn't working like I expect, despite all research. I must be missing something...

File 1...

# cat file1.csv
1       123     JohnDoe
1       456     BobDylan
1       789     BillyJean

File 2...

# cat file2.csv
111     123     DaddyDoe
222     456     DaddyDylan
666     777     Stranger
555     789     DaddyJean
444     888     Stranger
333     999     Stranger

I am trying to join on the second field of both files. When I perform a left outer join and only include fields from the first file, everything seems dandy.

# join -1 2 -2 2 -a 1 -o 1.2 1.3 file1.csv file2.csv
123 JohnDoe
456 BobDylan
789 BillyJean

But as soon as I include a field from the second file, it all goes wack.

# join -1 2 -2 2 -a 1 -o 1.2 1.3 2.3 file1.csv file2.csv
 DaddyDoeoe
 DaddyDylann
789 BillyJean DaddyJean

The last line looks perfect! What's up with the others? Any idea? Thanks in advance!

EDIT: Here is my attempt with actual CSVs.

# cat file1.csv
1,123,JohnDoe
1,456,BobDylan
1,789,BillyJean

# cat file2.csv
111,123,DaddyDoe
222,456,DaddyDylan
666,777,Stranger
555,789,DaddyJean
444,888,Stranger
333,999,Stranger

# join -t, -1 2 -2 2 -a 1 -o 1.2 1.3 2.3 file1.csv file2.csv
,DaddyDoeoe
,DaddyDylann
789,BillyJean,DaddyJean


You used the -a option.

-a file_number

In addition to the default output, produce a line for each unpairable line in file file_number.
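To make the quoted description concrete, here is a minimal sketch of what -a 1 adds to the default output. The two small files (left.csv and right.csv) are made up for illustration and keyed on their first field; they are not the files from the question.

```shell
# Hypothetical sample data, already sorted on the join field (field 1).
tmp=$(mktemp -d)
printf '123,JohnDoe\n456,BobDylan\n789,BillyJean\n' > "$tmp/left.csv"
printf '123,DaddyDoe\n789,DaddyJean\n' > "$tmp/right.csv"

# Default (inner) join: only keys present in both files are printed.
inner=$(join -t, "$tmp/left.csv" "$tmp/right.csv")

# With -a 1, unpairable lines from the first file are printed as well.
outer=$(join -t, -a 1 "$tmp/left.csv" "$tmp/right.csv")

printf 'inner:\n%s\n\nouter:\n%s\n' "$inner" "$outer"
```

Key 456 has no partner in right.csv, so it appears only in the -a 1 output, as the bare line 456,BobDylan.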

In addition, the odd overwriting behavior indicates that you have embedded carriage returns (\r). I would examine those files closely with cat -v or a text editor that doesn't try to be "smart" about Windows files.
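A rough sketch of that check, assuming the files were saved with Windows (CRLF) line endings. A \r at the end of the last field sends the cursor back to the start of the line, so whatever join prints next overwrites the beginning of it. The temp directory and the file1.unix.csv name are placeholders, not anything from the question.

```shell
# Build a CRLF copy of the question's first file to reproduce the symptom.
tmp=$(mktemp -d)
printf '1,123,JohnDoe\r\n1,456,BobDylan\r\n1,789,BillyJean\r\n' > "$tmp/file1.csv"

# cat -v renders each carriage return as ^M at the end of the line.
cat -v "$tmp/file1.csv"

# Count the lines that contain a carriage return.
cr=$(printf '\r')
crlf_lines=$(grep -c "$cr" "$tmp/file1.csv")

# Delete every carriage return and write a clean copy.
tr -d '\r' < "$tmp/file1.csv" > "$tmp/file1.unix.csv"

# grep -c exits non-zero when nothing matches, hence the "|| true".
clean_lines=$(grep -c "$cr" "$tmp/file1.unix.csv" || true)

echo "lines with CR before: $crlf_lines, after: $clean_lines"
```

If cat -v shows no ^M on your real files, the cause is something else and this cleanup will not change the join output.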


Use the correct 'field' separator in your command.

When I changed your data to true CSV and used

join -t, -1 2 -2 2 -a 1 -o 1.2 1.3 2.3 file1.csv file2.csv
# ---^^^

I got

123,JohnDoe,DaddyDoe
456,BobDylan,DaddyDylan
789,BillyJean,DaddyJean

I hope this helps.
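For anyone who wants to reproduce that result from files with Windows line endings, here is a sketch under the assumption that carriage returns are what garbled the question's output. It strips them, sorts both files on the join field (join expects its inputs sorted on that field), and then runs the same join. The *.sorted.csv names are placeholders.

```shell
# Recreate the question's CSV data with CRLF line endings.
tmp=$(mktemp -d)
printf '1,123,JohnDoe\r\n1,456,BobDylan\r\n1,789,BillyJean\r\n' > "$tmp/file1.csv"
printf '111,123,DaddyDoe\r\n222,456,DaddyDylan\r\n666,777,Stranger\r\n555,789,DaddyJean\r\n444,888,Stranger\r\n333,999,Stranger\r\n' > "$tmp/file2.csv"

# Strip carriage returns, then sort each file on its second field.
# LC_ALL=C keeps the sort order consistent with what join expects.
for f in file1 file2; do
  tr -d '\r' < "$tmp/$f.csv" | LC_ALL=C sort -t, -k2,2 > "$tmp/$f.sorted.csv"
done

# Same join as above; the -o list is written in its comma-separated form.
joined=$(LC_ALL=C join -t, -1 2 -2 2 -a 1 -o 1.2,1.3,2.3 \
  "$tmp/file1.sorted.csv" "$tmp/file2.sorted.csv")
printf '%s\n' "$joined"
```

With the carriage returns gone, the three lines come out as 123,JohnDoe,DaddyDoe, 456,BobDylan,DaddyDylan and 789,BillyJean,DaddyJean.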


If you are doing this on the command line, why not use paste? paste -d, file1 file2 >> file3

The -d argument is the delimiter.
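A quick sketch of what paste does with the question's data, for comparison. Note that paste glues lines together by position rather than by matching a key, so it only gives the same pairing as join when both files have their rows in the same order and of the same count. The temp directory is a placeholder.

```shell
# The question's CSV data, without carriage returns.
tmp=$(mktemp -d)
printf '1,123,JohnDoe\n1,456,BobDylan\n1,789,BillyJean\n' > "$tmp/file1.csv"
printf '111,123,DaddyDoe\n222,456,DaddyDylan\n666,777,Stranger\n555,789,DaddyJean\n444,888,Stranger\n333,999,Stranger\n' > "$tmp/file2.csv"

# Merge line N of file1 with line N of file2, separated by a comma.
pasted=$(paste -d, "$tmp/file1.csv" "$tmp/file2.csv")
printf '%s\n' "$pasted"

# Pull out the third merged line to see which rows were paired.
third=$(printf '%s\n' "$pasted" | sed -n '3p')
```

Here the third line pairs BillyJean (key 789) with the Stranger row (key 777), and the last three lines start with an empty field because file1 has run out of rows.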
