[TriLUG] sort mystery

Kevin Hunter hunteke at earlham.edu
Fri Apr 30 21:26:42 EDT 2010


At 6:37pm -0400 Fri, 30 Apr 2010, Michael Hrivnak wrote:
> [What's up with the sort order of ls?!!  Vim does it right!]

Collation.  Before you can sort -- anything -- you need to have a
definition of what it means to "be sorted".  In English, we tend to take
our alphabet of ABCs to be the canonical sort order.

However, sort order is arbitrary, and in fact differs depending on
where you are and the context you are working in.  To pick an example
close to home, what's the difference between 'A' and 'a'?  Should 'A'
come before 'a'?  Should they be treated equally?  How should numbers
enter the fray?  Do they sort before or after the alphabet?

Without boring you with the nitty-gritty (if you're curious, GIYF), the
man page suggests that unless told otherwise, ls sorts "alphabetically"
(according to the default locale).  I'll get corrected, I'm sure, but I
believe most recent distros ship with a UTF-8 locale by default.  Thus,
you can fiddle with this:

$ LC_COLLATE="C" ls           # your expected order
$ LC_COLLATE="en_US.utf8" ls  # "weird" sort ordering
$ LC_COLLATE="POSIX" ls       # your expected order

$ man locale     # for more info
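
One caveat if the commands above seem to do nothing: LC_ALL, when set,
overrides LC_COLLATE, and LANG is only consulted when neither is set.
Here is a minimal sketch for checking which collation name wins.  The
helper name effective_collate is made up for illustration, and the "C"
fallback is an assumption that matches glibc; POSIX leaves the default
implementation-defined.

```shell
# Hypothetical helper: print the locale name that governs collation,
# following the usual precedence LC_ALL > LC_COLLATE > LANG.
effective_collate() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_COLLATE:-}" ]; then
        printf '%s\n' "$LC_COLLATE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Assumption: nothing set means the C locale (true on glibc).
        printf '%s\n' "C"
    fi
}

effective_collate
```

If this prints something you did not expect, use LC_ALL="C" ls instead
of LC_COLLATE="C" ls.  Running locale with no arguments reports the same
information, and locale -a lists the locales installed on your machine.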

The sort command will also respect the collation you choose.  Vim's
sort, however, is a little more fickle, I believe.  Whether or not Vim's
sort function respects your locale settings depends on the library
against which it was compiled.
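
To see the sort command's side of this, here is a small sketch that
sorts a handful of made-up names in the C locale, where comparison is by
raw byte value.  The wrapper name sort_c is just for illustration; it
uses LC_ALL rather than LC_COLLATE so that an LC_ALL already set in your
environment cannot override it.

```shell
# Illustrative wrapper: sort stdin by raw byte value (the C locale).
sort_c() {
    LC_ALL=C sort
}

# In ASCII, digits come before uppercase, and uppercase before lowercase.
printf '%s\n' apple Banana 10 Apple banana | sort_c
# prints, one per line: 10 Apple Banana apple banana
```

Piping the same input through LC_ALL="en_US.utf8" sort (assuming that
locale is installed) typically interleaves upper and lower case instead,
which is the "weird" ordering ls was showing.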

Hope this helps,

Kevin


