
Size of wchar_t for Unicode encoding


Is there a 32-bit wide character type for encoding UTF-32 strings? I'd like to do it via std::wstring, which apparently shows me that the size of a wide character is 16 bits on the Windows platform.


You won't be able to do it with std::wstring on many platforms, because it will have 16-bit elements.

Instead you should use std::basic_string<char32_t>, but this requires a compiler with some C++0x support.
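
For example, a minimal sketch assuming a C++11 (then C++0x) compiler; std::u32string is the standard typedef for std::basic_string<char32_t>:

#include <cassert>
#include <string>

int main() {
    std::u32string s = U"Hello";  // U"" yields a UTF-32 literal of type const char32_t[]
    assert(sizeof(s[0]) == 4);    // each element is a 32-bit code unit on common platforms
    s += U'!';                    // the usual basic_string operations apply
    return 0;
}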


The size of wchar_t is platform-dependent, and it is independent of UTF-8, UTF-16, and UTF-32 (it can be used to hold Unicode data, but nothing guarantees that it represents any of those encodings).
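
A quick check you can compile yourself to see the platform dependence (a standard-C++ sketch):

#include <climits>
#include <iostream>

int main() {
    // Prints 16 on Windows (MSVC) and typically 32 on Linux and macOS.
    std::cout << sizeof(wchar_t) * CHAR_BIT << " bits\n";
    return 0;
}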

I strongly recommend using UTF-8 with std::string for internal string representation, and using an established library such as ICU for complex manipulation and conversion tasks involving Unicode.
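
A sketch of that combination, assuming ICU4C is installed (link with something like -licuuc); icu::UnicodeString::fromUTF8 and toUTF8String are ICU's conversion calls:

#include <iostream>
#include <string>
#include <unicode/unistr.h>   // icu::UnicodeString

int main() {
    std::string utf8 = "gr\xC3\xBC\xC3\x9F" "e";  // "grüße" as UTF-8 bytes in a std::string
    icu::UnicodeString u = icu::UnicodeString::fromUTF8(utf8);
    u.toUpper();               // Unicode-aware upper-casing (maps ß to SS)
    std::string upper;
    u.toUTF8String(upper);     // back to UTF-8 in a std::string
    std::cout << upper << '\n';
    return 0;
}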


Just use a typedef!

It would look something like this:

typedef unsigned int char_32;   // assumes int is 32 bits wide on your platform

And use it like this:

char_32 myChar = 'A';

A C-style string takes more work, because an ordinary literal like "Hello World" has type const char* and cannot initialize a char_32 pointer (before C++11 there is no 32-bit string literal), so the array has to be spelled out:

const char_32 string_of_32_bit_char[] = { 'H', 'i', 0 };


The modern answer is to use char32_t (C++11), which can be used with std::u32string. In practice, however, you should just use std::string with an encoding like UTF-8. Note that the old answer, before char32_t existed, was to use templates or macros to determine which unsigned integral type has a size of 4 bytes, and use that, as sketched below.
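
For the record, a sketch of that pre-C++11 trick (the names PickChar32 and char_32 are invented here for illustration; it assumes at least one of unsigned int and unsigned long is 4 bytes wide):

// Primary template: used when unsigned int is 4 bytes wide.
template <bool IntIs32Bits> struct PickChar32 { typedef unsigned int type; };

// Specialization: fall back to unsigned long otherwise.
template <> struct PickChar32<false> { typedef unsigned long type; };

typedef PickChar32<sizeof(unsigned int) == 4>::type char_32;

// Crude pre-C++11 compile-time check that we really got 4 bytes:
// the array type is ill-formed (size -1) if the condition is false.
typedef char assert_char32_is_4_bytes[sizeof(char_32) == 4 ? 1 : -1];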
