
Non-English alphanumerics in a text file

Developer https://www.devze.com 2023-01-02 03:32 Source: Internet

C# WinForm application

EDIT: It appears there's concern about foreign language compatibility.

This is a non-issue. The card game I'm making this utility for is primarily in English. In the future I may support other languages, but everything will still be keyed off the English names, which are a primary key in both the program and the rules of the game.

I can simply add additional tables with the English name, followed by the translated text, and everything should be fine.

.

Part of my program reads input from a text file containing names and compares it to another list of names. Sometimes these names contain non-English letters in the input file, particularly an accented "o" and the Latin AE.

When this text input is compared to the names, those non-English characters cause mismatches. I'd like to find a way to replace these characters with their English counterparts in most cases, such as "[accented o]" -> "o"
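As a rough illustration of the "[accented o]" -> "o" mapping described above, one commonly used approach is Unicode normalization: decompose each letter into a base letter plus combining marks, then drop the marks. The class and method names below (`NameFolding`, `StripDiacritics`) are hypothetical, and the sketch assumes the input was read into the string with the correct encoding. Letters that have no canonical decomposition, such as the Latin AE (U+00C6), pass through unchanged.

```csharp
using System.Globalization;
using System.Text;

// Hypothetical helper, not part of the original program.
public static class NameFolding
{
    // Removes combining marks so that an accented letter folds to its base letter.
    // Characters without a canonical decomposition (e.g. U+00C6, Latin AE) are kept as-is.
    public static string StripDiacritics(string input)
    {
        // FormD splits a precomposed letter into base letter + combining marks.
        string decomposed = input.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Skip the combining marks; keep everything else.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }
        // Recompose whatever is left into the usual precomposed form.
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

Both sides of the comparison would need to go through the same folding, for example `NameFolding.StripDiacritics(inputName) == NameFolding.StripDiacritics(listName)`.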

.

I'm perfectly content to code a find/replace table (I only expect 12-30 problem characters), but I've got some roadblocks.

1) Hardcoding the find/replace table (in the ".cs" file) gives me errors, because the compiler doesn't like the characters.

Anyone know a trick to fix this, or do I just have to create a Find/Replace text file that would be read before this process?
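One possible trick for item 1, assuming the compiler errors come from the encoding of the ".cs" file, is to write the problem characters as `\uXXXX` escapes so the source stays pure ASCII. (Saving the source file as UTF-8 is another option.) The sketch below is only illustrative: `ReplacementTable`, `Apply`, and the four table entries are made up for the example and are not the actual list of problem characters.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical find/replace table; the entries are examples only.
public static class ReplacementTable
{
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { '\u00F3', "o" }, // LATIN SMALL LETTER O WITH ACUTE
        { '\u00F6', "o" }, // LATIN SMALL LETTER O WITH DIAERESIS
        { '\u00E9', "e" }, // LATIN SMALL LETTER E WITH ACUTE
        { '\u00FB', "u" }, // LATIN SMALL LETTER U WITH CIRCUMFLEX
    };

    // Replaces every character found in the table and copies all others unchanged.
    public static string Apply(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            string replacement;
            if (Map.TryGetValue(c, out replacement))
            {
                sb.Append(replacement);
            }
            else
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

Because the Latin AE (`'\u00C6'`) is not in the table, it is left untouched, which matches the intent of keeping that character. Moving the table into a text file later would only change how `Map` is filled.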

2) Identifying the letters is frustrating, but I'll only reach the replace logic if a match isn't found. This occurs when the non-English characters cause a mismatch, or when the name isn't in the list yet.

I'm not too worried about the inefficiency of a char-by-char check of each unmatched string, as this is a manual update process triggered every three months. Presumably comparing at the binary level of a single character should work, but I haven't managed to get that working.
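For the char-by-char check in item 2, a minimal sketch is to list the numeric code of every character outside the ASCII range, which makes it easy to see exactly which letters cause a mismatch. `CharInspector` and `DescribeNonAscii` are hypothetical names. The sketch assumes the file was decoded with its real encoding, for example via `File.ReadAllText(path, Encoding.UTF8)` if the file is UTF-8; with the wrong encoding the reported codes will not match the letters on screen.

```csharp
using System.Collections.Generic;

// Hypothetical diagnostic helper for unmatched names.
public static class CharInspector
{
    // Returns one "U+XXXX" entry for each non-ASCII UTF-16 code unit in the input,
    // in the order in which they appear.
    public static List<string> DescribeNonAscii(string input)
    {
        var found = new List<string>();
        foreach (char c in input)
        {
            if (c > 127)
            {
                found.Add("U+" + ((int)c).ToString("X4"));
            }
        }
        return found;
    }
}
```

Running this only on names that failed to match keeps the cost negligible for an update that happens every three months, and the reported codes can be pasted straight into a `\uXXXX` escape.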

3) The aforementioned [AE] character is used often, and it would be nice to at least allow the use of this character within the program, as I don't intend to replace it like I do the others. I've loaded [AE] characters into my database with no problems, and searches using "Ae," "AE," and "[AE]" have posed no problem at the SQL-level, so I'm fine with that functionality.

It's just that searching for other non-English characters is less intuitive.

.

So there's my problem, which is actually more of a nuisance than anything serious. Still, any help or advice would be greatly appreciated.


Are you sure these names aren't meant to be different? Are you sure that you want all of "è", "é", "ê", and "ë" to mean the same thing?

Especially in "foreign" names, characters with different diacritical marks are likely intended to be different. After all, to the people whose names those are, these characters are not foreign.
