
Does STL sort support UTF8?

Developer https://www.devze.com 2023-03-18 19:19 Source: web

Does the STL sort function support alphabetical sorting of names which have UTF-8 characters in them? Say, names in German or French?


That depends entirely on how you store the UTF-8 characters and what your comparer looks like. The sort function is completely agnostic of the elements it sorts.

But you probably mean “… when stored in a char array”, and then the answer is no, since each char stores an individual byte of a UTF-8 sequence rather than a logical character. The sort function sorts elements delimited by iterators, and it works only if the iterators, or the elements they refer to, are aware of the data that they contain. This isn’t the case for an array of chars that encode UTF-8.
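To make the byte-level behaviour concrete, here is a minimal sketch. The helper names sort_default and sample_names, and the sample data, are made up for illustration. The default comparison for std::string looks only at byte values, so a name starting with "Ä" (two bytes, 0xC3 0x84, in UTF-8) lands after every plain ASCII name instead of near "A".

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: sorts with the default std::string operator<,
// which compares the underlying bytes and knows nothing about UTF-8.
std::vector<std::string> sort_default(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());
    return names;
}

// Sample data (an assumption for illustration). "\xC3\x84" is the UTF-8
// encoding of U+00C4, so the second entry reads "Ärger".
std::vector<std::string> sample_names() {
    return {"Zimmer", "\xC3\x84" "rger", "Anna"};
}
```

Calling sort_default(sample_names()) puts "Ärger" last, because its first byte 0xC3 is greater than any ASCII letter. A German reader would expect it next to "Anna".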

The “correct” solution here is to parse the UTF-8 input into an array of proper (normalised) Unicode code points, sort those, and translate back to UTF-8.
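A rough sketch of that round trip might look like the following. The functions decode_utf8, encode_utf8 and sort_by_code_points are hypothetical names, the decoder assumes well-formed input, and the normalisation step is left out entirely (the standard library has no facility for it).

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decodes well-formed UTF-8 into code points.
// No validation and no normalisation is performed.
std::u32string decode_utf8(const std::string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size();) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        // Number of continuation bytes implied by the lead byte.
        int extra = b < 0x80 ? 0 : b < 0xE0 ? 1 : b < 0xF0 ? 2 : 3;
        char32_t cp = extra == 0 ? b : b & (0x3F >> extra);
        for (int k = 1; k <= extra && i + k < in.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        }
        out.push_back(cp);
        i += static_cast<std::size_t>(extra) + 1;
    }
    return out;
}

// Hypothetical helper: encodes code points back into UTF-8.
std::string encode_utf8(const std::u32string& in) {
    std::string out;
    for (char32_t cp : in) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Decode every name, sort by code point sequence, encode again.
std::vector<std::string> sort_by_code_points(const std::vector<std::string>& names) {
    std::vector<std::u32string> decoded;
    for (const std::string& n : names) decoded.push_back(decode_utf8(n));
    std::sort(decoded.begin(), decoded.end());
    std::vector<std::string> result;
    for (const std::u32string& d : decoded) result.push_back(encode_utf8(d));
    return result;
}
```

Note that this orders names by code point value only. That is still not the dictionary order of any particular language, which needs a collation-aware comparison.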


All that is required is the proper comparison function. You can probably find one in ICU - International Components for Unicode . Look specifically at Collation.
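As an illustration of plugging a comparison function into sort, here is a small sketch that uses the standard library's std::locale in place of ICU. This is a stand-in, not ICU's API: a std::locale object can be passed to std::sort directly because its operator() compares two strings through the locale's collate facet. The helper name sort_with_locale is hypothetical, and whether a given locale collates UTF-8 text sensibly depends on the platform.

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: sorts names using the collation rules of `loc`.
// std::locale::operator() compares two strings via the locale's
// std::collate facet, so the locale itself serves as the comparator.
void sort_with_locale(std::vector<std::string>& names, const std::locale& loc) {
    std::sort(names.begin(), names.end(), loc);
}
```

A call such as sort_with_locale(names, std::locale("de_DE.UTF-8")) assumes that a locale with that name is installed; the std::locale constructor throws std::runtime_error if it is not. For consistent results across platforms, ICU's collation is probably the safer choice.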


C++0x adds Unicode string literals (u8"...", u"...", U"...") and the char16_t and char32_t character types.

This has nothing to do with STL.


I assume that you refer to the Standard Template Library - and the answer is no.

None of the standard libraries has a text string type. There are char arrays, but those are just vectors of bytes. There is std::string, but that is basically a string of bytes (or 16-bit words, or whatever the element type is). It has no notion of characters, let alone encodings.
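A quick way to see this is to compare what std::string reports with the number of code points in the text. The helper count_code_points below is a hypothetical name, and it assumes well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in well-formed UTF-8 by
// skipping continuation bytes (those of the form 10xxxxxx).
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char b : s) {
        if ((b & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

For the string "\xC3\x84" "rger" (that is, "Ärger" in UTF-8), size() reports 6 because it counts bytes, while count_code_points reports 5.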
