
What's wrong with my character set (Win32 API)

Developer https://www.devze.com 2023-01-30 14:50 Source: Internet

I'm currently learning Win32 using this tutorial, and I have a hard time with my displayed characters.

Take for instance this piece of code, which adds a menu to my window upon creation:

    case WM_CREATE: {
            HMENU hMenu, hSubMenu;
            HICON hIcon, hIconSm;

            hMenu = CreateMenu();
            hSubMenu = CreatePopupMenu();

            AppendMenu(hSubMenu, MF_STRING, ID_FILE_EXIT, "Exit");
            AppendMenu(hMenu, MF_STRING | MF_POPUP, (UINT)hSubMenu, "File");

            hSubMenu = CreatePopupMenu();
            AppendMenu(hSubMenu, MF_STRING, ID_STUFF_GO, "&GO");
            AppendMenu(hMenu, MF_STRING | MF_POPUP, (UINT)hSubMenu, "&Stuff");

            SetMenu(hwnd, hMenu);

            hIcon = LoadImage(NULL, "Stuff.ico", IMAGE_ICON, 32, 32, LR_LOADFROMFILE);

            if (hIcon)
                SendMessage(hwnd, WM_SETICON, ICON_BIG, (LPARAM)hIcon);
            else
                MessageBox(hwnd, "Could not load large icon!", "Load Error", MB_OK | MB_ICONERROR);

            hIconSm = LoadImage(NULL, "Stuff.ico", IMAGE_ICON, 16, 16, LR_LOADFROMFILE);

            if(hIconSm)
                SendMessage(hwnd, WM_SETICON, ICON_SMALL, (LPARAM)hIconSm);
            else
                MessageBox(hwnd, "Could not load small icon!", "Load Error", MB_OK | MB_ICONERROR);
        }
        break;

That is inside of a switch block within my WndProc function that handles the Windows Messages received from the Message Loop.

Each string that is to be displayed:

"Exit"
"File"
"&GO"
"&Stuff"

is unreadable at runtime: they are displayed as little squares, as if the code page were wrong or something like that. When I run the tutorial's own program, all the strings are displayed correctly. I tend to stick exactly to what the tutorial says to help me get things right, and its pedagogy is good. Anyway!...

I'm using:

  1. Microsoft Visual Studio 2008 Team System;
  2. Microsoft Windows Server 2003 using RDP;
  3. Local OS is Windows Vista Ultimate.

Does anyone have a clue about it?


You have an issue with Unicode vs. Windows ANSI character encoding. Historically, Windows used an extended ASCII that Microsoft misnamed ANSI. This brought with it the need for code pages, because an 8-bit character does not provide enough code points to represent all of the European writing systems, let alone the rest of the world. When Win32 was developed, Microsoft settled on Unicode as the preferred character set. (Actually, they settled on the UTF-16LE encoding of the Unicode character set, but that detail isn't entirely relevant right now.) However, there was too much existing code to require that every port from Win16 to Win32 also change the character encoding of all its strings.

Their solution was clever (some have argued that it was too clever). Every Win32 API entry point that takes a string comes in two flavors. The first flavor takes ANSI strings and handles the conversion to UTF-16LE internally. The second (and now preferred) flavor takes UTF-16LE strings directly. They also conspired with the Visual C team to define wchar_t as a 16-bit type, and to ensure that L"" string literals are stored as UTF-16LE.

To make porting from existing Win16 code easy, every Win32 API that takes strings is mapped at compile time by a macro to its A or W flavor, depending on whether or not the preprocessor symbol UNICODE is defined. MessageBox, for example, becomes either MessageBoxA or MessageBoxW.

This mapping can't fix the string literals, so they also introduced a macro to designate string literals that are to be either narrow or wide depending on UNICODE, and a matching typedef so that variables can be declared to hold pointers to them.
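The mechanism described above can be imitated in portable C. The following is only a sketch: `DemoTextLengthA`, `DemoTextLengthW`, `DEMO_T`, `DEMO_TCHAR` and `demo_exit_label_length` are hypothetical names invented for illustration, standing in for a real A/W API pair, the literal macro and the matching typedef. It is not how the Windows headers are actually written.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for an A/W pair; these are not Win32 functions. */
size_t DemoTextLengthA(const char *text)    { return strlen(text); }
size_t DemoTextLengthW(const wchar_t *text) { return wcslen(text); }

#ifdef UNICODE
typedef wchar_t DEMO_TCHAR;               /* wide build: wchar_t text   */
#define DEMO_T(literal) L##literal        /* "Exit" becomes L"Exit"     */
#define DemoTextLength  DemoTextLengthW   /* plain name maps to W       */
#else
typedef char DEMO_TCHAR;                  /* narrow build: char text    */
#define DEMO_T(literal) literal           /* "Exit" stays "Exit"        */
#define DemoTextLength  DemoTextLengthA   /* plain name maps to A       */
#endif

/* The same source line compiles in both builds, because the literal,
   the pointer type and the function all switch together. */
size_t demo_exit_label_length(void)
{
    const DEMO_TCHAR *label = DEMO_T("Exit");
    return DemoTextLength(label);
}
```

The point of the sketch is that all three pieces have to switch together. If the function name switches to the wide flavor while the literal stays narrow, the callee receives bytes it will misread.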

So, for best portability to and from Win16, you would #include <tchar.h>, use TCHAR in place of char or wchar_t, wrap all text-containing string literals in the _T() macro, and call on the Win32 APIs with the non-suffixed names like MessageBox.

This isn't a perfect solution, however. The moment your code needs to manipulate or compute a string that will be displayed to a user, you discover that it is difficult to write code that is perfectly portable in the TCHAR regime. There are replacements for all of the standard string functions that manipulate TCHARs, but it is difficult to verify with automated testing that you correctly used them such that the code would compile and work correctly both with and without UNICODE defined.

If writing new Win32 code today, my advice would be to define UNICODE in the project, add a check that it is really defined in a common header file, and use L"" strings and the W flavors of all wrapped calls explicitly.
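One possible shape for such a check, as a sketch only: the header name `project_config.h` is an assumption, and the `_WIN32` guard is my own addition so that the header stays harmless if it is ever pulled into a non-Windows build. UNICODE selects the W flavors in the Windows headers, while _UNICODE is the symbol the C runtime's tchar.h looks at, so the sketch insists on both.

```c
/* project_config.h (hypothetical name): include this first everywhere. */
#ifndef PROJECT_CONFIG_H
#define PROJECT_CONFIG_H

/* Stop the build early if a Windows target was configured without the
   wide-character symbols, instead of failing later in confusing ways. */
#if defined(_WIN32) && !(defined(UNICODE) && defined(_UNICODE))
#error "Define UNICODE and _UNICODE in the project settings."
#endif

#endif /* PROJECT_CONFIG_H */
```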

Finally, this whole essay is prompted by your code that displays the missing-character glyph (the empty square box is the glyph displayed when a font lacks a particular character). That is occurring because your ASCII string literals are being interpreted by the Win32 code as if they were UTF-16LE, so the four bytes of "Exit" (45 78 69 74 in hex) are taken to be two Unicode characters, U+7845 and U+7469, which are both CJK unified ideographs. Unless you have Han fonts installed, neither is likely to be present in any font on your system, so you get the missing-character glyph instead.
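The arithmetic behind those two code points can be checked with a few lines of portable C. This is a minimal sketch; `demo_utf16le_unit` is a hypothetical helper written for this illustration and is not part of any Windows API.

```c
#include <stddef.h>
#include <stdint.h>

/* Read the index-th UTF-16LE code unit from a raw byte buffer.
   Little-endian means the first byte of each pair is the low byte.
   The caller must make sure the buffer holds 2 * (index + 1) bytes. */
uint16_t demo_utf16le_unit(const char *bytes, size_t index)
{
    const unsigned char *b = (const unsigned char *)bytes;
    return (uint16_t)(b[2 * index] | (b[2 * index + 1] << 8));
}
```

Calling `demo_utf16le_unit("Exit", 0)` pairs 'E' (0x45) with 'x' (0x78) and yields 0x7845. Index 1 pairs 'i' (0x69) with 't' (0x74) and yields 0x7469.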

This is happening because you are mixing the wrapper macro with an ASCII string literal. You have:

AppendMenu(hSubMenu, MF_STRING, ID_FILE_EXIT, "Exit");

but you should have one of the following:

AppendMenu(hSubMenu, MF_STRING, ID_FILE_EXIT, _T("Exit"));
AppendMenuA(hSubMenu, MF_STRING, ID_FILE_EXIT, "Exit");
AppendMenuW(hSubMenu, MF_STRING, ID_FILE_EXIT, L"Exit");

Of these, I recommend the last.


Strangeness about the little squares aside, your code is wrong: it is not Unicode compliant. You should prefix all string literals with L (as in L"string") and change the compile settings to Unicode (this makes the Windows functions accept UTF-16). UTF-16 is the native Windows encoding, which is how text was meant to be handled on Windows.

An alternative approach is to keep your strings in UTF-8 internally, call the wide (W) APIs, and convert to UTF-16 only at the point where you call them. It is described at http://utf8everywhere.org.
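To make the conversion step concrete, here is a portable sketch of what turning UTF-8 into UTF-16 code units involves. On Windows you would normally let the system do this with MultiByteToWideChar and the CP_UTF8 code page. The function `demo_utf8_to_utf16` below is a hypothetical, simplified stand-in written for illustration. It checks sequence structure and range but does not reject overlong or surrogate encodings, so do not treat it as a validating decoder.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a NUL-terminated UTF-8 string into UTF-16 code units.
   Returns the number of units written (excluding the terminator),
   or (size_t)-1 on malformed input or if dst is too small. */
size_t demo_utf8_to_utf16(const char *src, uint16_t *dst, size_t dst_cap)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;

    while (*p != 0) {
        uint32_t cp;
        int extra;  /* number of continuation bytes still expected */

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return (size_t)-1;       /* invalid lead byte */
        p++;

        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80) return (size_t)-1; /* truncated */
            cp = (cp << 6) | (uint32_t)(*p & 0x3F);
            p++;
        }
        if (cp > 0x10FFFF) return (size_t)-1;           /* out of range */

        if (cp < 0x10000) {           /* one code unit */
            if (n + 1 >= dst_cap) return (size_t)-1;
            dst[n++] = (uint16_t)cp;
        } else {                      /* surrogate pair */
            if (n + 2 >= dst_cap) return (size_t)-1;
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    if (n >= dst_cap) return (size_t)-1;
    dst[n] = 0;                       /* terminate the wide string */
    return n;
}
```

With this approach the program's own strings stay as char throughout, and only the thin layer that talks to the W functions ever sees 16-bit code units.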
