I've recently been learning Haskell, and I noticed that the String
type (or [Char]
) can be ordered. For example, this is valid:
ghci> "foo" > "bar"
True
gh开发者_运维问答ci> "?<>!" `compare` "[&*}"
LT
How does Haskell order String
s, and when would this functionality be useful?
How does Haskell order Strings, and when would this functionality be useful?
Firstly, Char is an instance of Ord, given by equality primitives on the underlying primitive char type on the machine.
instance Ord Char where
(C# c1) > (C# c2) = c1 `gtChar#` c2
(C# c1) >= (C# c2) = c1 `geChar#` c2
(C# c1) <= (C# c2) = c1 `leChar#` c2
(C# c1) < (C# c2) = c1 `ltChar#` c2
then String is defined as a [Char]
(list of Char), and lists in general have an ordering, if their elements have an ordering:
instance (Ord a) => Ord [a] where
compare [] [] = EQ
compare [] (_:_) = LT
compare (_:_) [] = GT
compare (x:xs) (y:ys) = case compare x y of
EQ -> compare xs ys
other -> other
and that's it. Any list whose elements have any ordering will in turn by ordered.
Since Char is ordered by its underlying representation as a bit pattern, and lists are given by element-wise ordering of the lists, you thus see the behavior for String.
when would this functionality be useful?
For inserting Strings into data structures that are polymorphic, but require an Ordering method. The most notable are Set and Map.
How does Haskell order Strings,
Here's some definitions from the Haskell Prelude.
Strings are just lists of characters:
type String = [Char]
Characters are ordered by their Unicode codepoint:
instance Ord Char where
c <= c' = fromEnum c <= fromEnum c'
And lists are compared using lexicographical order (implicitly by the structure of a list and the definition of automatically-derived Ord
):
data [a] = [] | a : [a] deriving Ord -- not actually valid Haskell :)
instance Ord a => Ord [a]
and when would this functionality be useful?
You need an instance of Ord
to use things like Map
or Set
.
2 lists are compared by lexicographical order (i.e. from left to right), provided each element is an instance of the Ord typeclass. Strings can be ordered because Char can be ordered.
Try this:
[1,2,3] < [2,3,4,5]
I would assume that it's lexicographical order, over whatever character encoding is being used. (In other words, "alphabetical" order with a, for ASCII or other single-byte encodings, 256-character alphabet.)
精彩评论