
I have trouble understanding Unix sort. Consider the following file (tab-separated):

aa  ~ a1
aa  B
b   A
b   ~ e
bb  B
bb  ~ B

When calling:

cat tmp2 | sort -t $'\t' -k1,2

I get

aa  ~ a1
aa  B
b   A
bb  B
bb  ~ B
b   ~ e

As far as I understand, -t $'\t' says to treat the tab character as the field separator instead of whitespace, and -k1,2 says to sort by the first column and, if two rows have the same first column, then by the second one. But in that case, shouldn't my last 'b' appear in the fourth row?


1 Answer


No, -k1,2 says to sort on the portion of the line that starts at the beginning of the first field and ends at the end of the second field.
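To see the single-key behaviour in isolation, here is a small sketch. The two sample lines and the comma separator are made up for illustration (they are not from the question), and LC_ALL=C is set so that the comparison is plain byte order and does not depend on the locale. With -k1,2 the separator is part of the key and takes part in the comparison; with -k1,1 the first field is compared on its own.

```shell
# Hypothetical two-line sample, comma-separated (not the file from the question).
sample=$(printf 'a,2\na+,1\n')

# One key spanning fields 1-2: the keys are "a,2" and "a+,1".
# In byte order '+' (0x2B) sorts before ',' (0x2C), so "a+,1" comes first.
span=$(printf '%s\n' "$sample" | LC_ALL=C sort -t , -k1,2)

# Field 1 alone as the first key: "a" is a prefix of "a+", so "a,2" comes first.
split=$(printf '%s\n' "$sample" | LC_ALL=C sort -t , -k1,1 -k2,2)

printf 'k1,2:\n%s\n' "$span"
printf 'k1,1 k2,2:\n%s\n' "$split"
```

The two commands order the same two lines differently, which is the distinction between one key running from field 1 through field 2 and two separate keys.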

To sort on the first field and then on the second, it's:

sort -k1,1 -k2,2
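As a sketch of how this applies to the file in the question: the command above is shown without -t, so the example below assumes the tab separator from the question is still wanted and passes it explicitly. It also assumes LC_ALL=C, purely so that the result is reproducible; in another locale the ordering within a field follows that locale's collation rules.

```shell
# Recreate the question's tab-separated sample (printf expands each \t to a tab).
data=$(printf 'aa\t~ a1\naa\tB\nb\tA\nb\t~ e\nbb\tB\nbb\t~ B\n')

# A literal tab; in bash this is the same value as $'\t'.
tab=$(printf '\t')

# Sort on field 1 alone, then break ties on field 2 alone.
# LC_ALL=C forces byte-order comparison (an assumption made for reproducibility).
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k1,1 -k2,2)
printf '%s\n' "$sorted"
```

Under these assumptions both 'b' rows come before the 'bb' rows, because the first key is now exactly the first field and "b" is a prefix of "bb".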
    
(Edit: I could not put each string on its own line in a comment.) So why are the lines 'a~', 'a', 'a2', 'aa', 'aa3', 'aa~' sorted by 'cat tmp2 | sort -k1,1' as 'a', 'a~', 'a2', 'aa', 'aa~', 'aa3'? Since '~' comes after every other printable character in the ASCII table, shouldn't 'aa~' be sorted after 'aa3', for example? –  giulio Feb 7 at 21:00
    
@giulio. The ASCII table is only relevant for sorting in the C/POSIX locale. String comparison in other locales is a lot more complicated and tries to work the way it does in natural languages. For instance, in most locales spaces are ignored in the first pass of the comparison, which is why "b c" sorts after "bb". –  Stéphane Chazelas Feb 7 at 21:05
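A minimal sketch of the C/POSIX case described in the comment above, using the strings from the earlier comment. Setting LC_ALL=C for the sort command is the only assumption; it makes sort compare raw byte values, which is the ordering the ASCII table predicts.

```shell
# The strings from the comment, one per line.
words=$(printf '%s\n' 'a~' 'a' 'a2' 'aa' 'aa3' 'aa~')

# LC_ALL=C makes sort compare raw bytes, so '~' (0x7E) sorts after digits and letters.
c_order=$(printf '%s\n' "$words" | LC_ALL=C sort)
printf '%s\n' "$c_order"

# In byte order a space (0x20) sorts before 'b', so "b c" precedes "bb".
space_order=$(printf '%s\n' 'bb' 'b c' | LC_ALL=C sort)
printf '%s\n' "$space_order"
```

In the C locale the first list comes out as a, a2, aa, aa3, aa~, a~, and "b c" sorts before "bb". Running the same commands without LC_ALL=C uses the collation rules of whatever locale is active, so the order may differ.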
