I have tab-delimited rows, and I want to sort the lines with respect to the first column only.
Can this be achieved using Unix sort?
u.s 2||`` U.S ''||527 || 107
u.s. 2||`` U.S. ''||532 || 107
us. 2||Us.||532 || 112
u.s. 2||U.s.||629 || 112
us. 2||US.||6444 || 112
us 2||US||8655700 || 27
u.s 2||U.s||992 || 112
It has to sort on the first column only:
u.s
u.s.
us.
u.s.
us.
us
u.s
The sort does not take the dots into account. The listing above is the output of sort -k1: the two u.s. lines do not end up together.
If you're sorting by the first field, there's no reason to specify the key unless you want to ignore the rest of the line. If you want to do that, you need to use -k1,1. You will also need to specify the C locale (or the synonymous POSIX locale) so that the periods are not ignored.
LC_COLLATE=C sort -k1,1 inputfile
or
LC_COLLATE=C sort inputfile
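As a minimal sketch of the effect, the snippet below sorts a small tab-delimited sample in the C locale. The sample data is made up for illustration (modeled on the first column in the question), and it uses LC_ALL rather than LC_COLLATE because LC_ALL, when set in the environment, overrides the other locale variables.

```shell
# Hypothetical tab-delimited sample: first column is the key, second is a row id.
sample=$(printf 'us.\t1\nu.s.\t2\nus\t3\nu.s\t4\nu.s.\t5\nus.\t6\n')

# A literal tab for -t, built with printf so it also works outside Bash.
tab=$(printf '\t')

# LC_ALL=C makes sort compare bytes, so '.' is significant.
# -k1,1 limits the sort key to the first column only.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort -t "$tab" -k1,1)
printf '%s\n' "$sorted"
```

With byte-order comparison the identical keys become adjacent: u.s first, then both u.s. lines, then us, then both us. lines.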
My sort (on Linux) can do it. I don't know how portable it is. In Bash:
sort -k1 -t$'\t'
-k1 starts the sort key at column 1 (with no end given, the key runs to the end of the line), and -t specifies the field separator.
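The difference between a key that runs to the end of the line and one that stops at the first column can be sketched as follows. The two-line input is made up, and the -s (stable) option is an assumption about your sort: GNU and BSD sort provide it, but it is not required by POSIX. It is used here only to turn off the last-resort whole-line comparison so the key's extent becomes visible.

```shell
tab=$(printf '\t')

# Two hypothetical rows with the same first column, deliberately out of order
# in the second column.
rows=$(printf 'a\t2\na\t1\n')

# Key is column 1 only: the keys are equal, and -s keeps the input order.
first_col_only=$(printf '%s\n' "$rows" | LC_ALL=C sort -s -t "$tab" -k1,1)

# Key starts at column 1 and runs to end of line: column 2 takes part in the comparison.
to_end_of_line=$(printf '%s\n' "$rows" | LC_ALL=C sort -s -t "$tab" -k1)

printf 'k1,1:\n%s\nk1:\n%s\n' "$first_col_only" "$to_end_of_line"
```

So if only the first column should decide the order, -k1,1 is the form to use.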
You probably need to set the locale for the sort, and set it to the C locale:
LANG=C sort -k1 data.file
I believe you are looking for:
sort -k 1 filetosort
Tab counts as whitespace, and whitespace is the default field separator for sort.