Why use MultiByteToWideChar and WideCharToMultiByte at the same time? I saw some code like this:
char szLine[MAX_LENGTH_STRING] = {0};
... // some operations on szLine
char *szUtf8string;
wchar_t *szUnicodeString;
int size;
int room;
size = strlen(szLine)+1;
room = MultiByteToWideChar(CP_ACP, 0, szLine, -1, NULL, 0);
szUnicodeString = (wchar_t*) malloc((sizeof(wchar_t))*room);
MultiByteToWideChar(CP_ACP, 0, szLine, -1, szUnicodeString, room);
room = WideCharToMultiByte(CP_UTF8, 0, szUnicodeString, -1, NULL, 0, NULL, NULL);
szUtf8string = (char*) malloc(room);
WideCharToMultiByte(CP_UTF8, 0, szUnicodeString, -1, szUtf8string, room, NULL, NULL);
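This code fragment first converts the string from a multibyte representation in the system default code page (CP_ACP) to Unicode (UTF-16 in a wchar_t buffer), then converts that to the UTF-8 multibyte representation. The net effect is to convert text from the default code page to UTF-8. Both calls are needed because the Win32 API has no direct conversion between two multibyte encodings; the wide-character form is the intermediate step.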
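Code like this is fragile if it assumes a fixed size ratio between the two encodings, for example that the UTF-8 version is at most twice as long as the input. That probably works most of the time, but a single byte in the default code page can map to a longer UTF-8 sequence: byte 0x80 in Windows-1252 is the euro sign (U+20AC), which takes 3 bytes in UTF-8. The fragment in the question avoids guessing by first calling each function with a NULL output buffer to get the required size, but it does not check the return values of the conversion functions or of malloc.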
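The two-step conversion with the missing checks added could be sketched roughly as follows. This is only an illustration, not the original author's code: the helper name acp_to_utf8 is hypothetical, and the sketch assumes a NUL-terminated input string and a Windows build (hence the _WIN32 guard).

```c
/* Windows-only sketch: convert a NUL-terminated string from the system
   default ANSI code page (CP_ACP) to UTF-8 via a UTF-16 intermediate. */
#ifdef _WIN32
#include <windows.h>
#include <stdlib.h>

/* acp_to_utf8 is a hypothetical helper name, not part of the original post.
   Returns a malloc'ed UTF-8 string that the caller must free(),
   or NULL if any step fails. */
char *acp_to_utf8(const char *src)
{
    wchar_t *wide = NULL;
    char *utf8 = NULL;
    int wide_len;
    int utf8_len;

    /* Step 1: query how many wchar_t units are needed (terminator included,
       because the input length is passed as -1). */
    wide_len = MultiByteToWideChar(CP_ACP, 0, src, -1, NULL, 0);
    if (wide_len == 0)
        return NULL;

    wide = (wchar_t *)malloc(sizeof(wchar_t) * (size_t)wide_len);
    if (wide == NULL)
        return NULL;

    if (MultiByteToWideChar(CP_ACP, 0, src, -1, wide, wide_len) == 0)
        goto done;

    /* Step 2: query how many bytes the UTF-8 form needs, then convert.
       The last two arguments must be NULL when the code page is CP_UTF8. */
    utf8_len = WideCharToMultiByte(CP_UTF8, 0, wide, -1, NULL, 0, NULL, NULL);
    if (utf8_len == 0)
        goto done;

    utf8 = (char *)malloc((size_t)utf8_len);
    if (utf8 == NULL)
        goto done;

    if (WideCharToMultiByte(CP_UTF8, 0, wide, -1, utf8, utf8_len, NULL, NULL) == 0) {
        free(utf8);
        utf8 = NULL;
    }

done:
    free(wide);   /* the UTF-16 buffer is only an intermediate */
    return utf8;
}
#endif /* _WIN32 */
```

A caller would use it as char *s = acp_to_utf8(szLine); and release the result with free(s) after checking it for NULL. Because both buffer sizes come from the API itself, no assumption about how much the text grows is needed.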