Culture specific characters to nice URL format

Developer https://www.devze.com 2022-12-26 11:34 Source: Internet
I need some functionality to turn a string into a URL-friendly format: "knæ som gør" should become "kna-som-gor".

That is, replacing culture-specific characters with characters that can be used in URLs.

Using .NET and C#.

Please help me :)

/Andreas


Don't complicate things. :)

Either use a regexp, or simply use String.Replace.
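A minimal sketch of that idea, assuming you are happy to maintain the replacement list by hand. The class name SlugSketch and the method ToSlug are made up for illustration, and the mapping of æ to "a" simply follows the example in the question ("ae" is another common choice):

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlugSketch
{
    // Hypothetical helper: hand-written replacements, then a regex clean-up.
    public static string ToSlug(string input)
    {
        string s = input.ToLowerInvariant()
            .Replace("æ", "a")   // follows the question's example
            .Replace("ø", "o")
            .Replace("å", "a");

        // Collapse every run of characters outside a-z and 0-9 into one hyphen.
        s = Regex.Replace(s, "[^a-z0-9]+", "-");

        // Drop hyphens left over at either end.
        return s.Trim('-');
    }

    public static void Main()
    {
        Console.WriteLine(ToSlug("knæ som gør")); // kna-som-gor
    }
}
```

Any character that is not in the list and not plain a-z or 0-9 ends up as a hyphen, so the list has to cover every letter you care about.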


You can find a solution that removes diacritics in the question "How do I remove diacritics (accents) from a string in .NET?". This solution does not help you with æ or ø, though.

Maybe that removes enough of your special characters that the rest can be translated using simple replacing?
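For reference, the usual shape of that diacritics approach looks roughly like the sketch below. It is not copied from the linked answer. It decomposes the string with String.Normalize and drops the non-spacing marks. The names DiacriticsSketch and RemoveDiacritics are only illustrative:

```csharp
using System;
using System.Globalization;
using System.Text;

public static class DiacriticsSketch
{
    // Hypothetical helper: strip combining marks after canonical decomposition.
    public static string RemoveDiacritics(string text)
    {
        // FormD splits e.g. "é" into "e" followed by U+0301.
        string decomposed = text.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // Keep everything except the combining (non-spacing) marks.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }

        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        Console.WriteLine(RemoveDiacritics("crème brûlée")); // creme brulee
        Console.WriteLine(RemoveDiacritics("knæ som gør"));  // knæ som gør (unchanged)
    }
}
```

The second call shows the limitation: æ and ø have no Unicode decomposition, so they pass through untouched and still need an explicit Replace afterwards.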

If "url-friendly" does not mean pretty, you could also use HttpUtility.UrlEncode, which produces "kn%c3%a6+som+g%c3%b8r".
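A quick way to try the encoding route, assuming System.Web is available to your project (on .NET Framework that means referencing the System.Web assembly). The class name UrlEncodeSketch is only for illustration:

```csharp
using System;
using System.Web;

public static class UrlEncodeSketch
{
    public static void Main()
    {
        // Percent-encodes the UTF-8 bytes and turns spaces into '+'.
        string encoded = HttpUtility.UrlEncode("knæ som gør");
        Console.WriteLine(encoded); // kn%c3%a6+som+g%c3%b8r
    }
}
```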


Edit: Added possible solution (end of post).

I had a very similar problem, albeit for file names rather than URLs. The main problem seems to be that there is no standard way to ask for the "best ASCII replacement for ø", so even if you can locate all the unwanted characters it is hard to automate which replacement to insert.

I posted quite a bit of code that might be helpful. See this StackOverflow question for details.

Edit: I think the solution to this problem lies with StringInfo, which allows you to iterate through the text elements of a string (a base character together with its combining characters, or a surrogate pair). This should make it possible to detect and convert something like å, which can be encoded in Unicode either as a single precomposed character (U+00E5) or as "a" followed by a combining ring (U+030A): filter out the decorator and keep the part that is a normal character.
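Here is a rough sketch of how that could look. It is my own guess at an implementation, not tested beyond simple cases: it normalizes to FormD first (so a precomposed å is split into "a" plus the combining ring), then walks the text elements with StringInfo and keeps only the leading character of each. The names TextElementSketch and KeepBaseCharacters are invented for this example:

```csharp
using System;
using System.Globalization;
using System.Text;

public static class TextElementSketch
{
    // Hypothetical helper: keep the base character of every text element.
    public static string KeepBaseCharacters(string text)
    {
        // Decompose first so precomposed letters expose their combining marks.
        string decomposed = text.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);

        TextElementEnumerator elements = StringInfo.GetTextElementEnumerator(decomposed);
        while (elements.MoveNext())
        {
            string element = elements.GetTextElement();

            // Keep the first code point: two chars for a surrogate pair, else one.
            int length = char.IsSurrogatePair(element, 0) ? 2 : 1;
            sb.Append(element, 0, length);
        }

        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(KeepBaseCharacters("blå"));           // bla
        Console.WriteLine(KeepBaseCharacters("bla\u030A"));     // bla (already decomposed input)
    }
}
```

As with the other approaches, letters such as æ and ø are single characters without a decomposition, so they survive this step and still need their own replacement.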
