I just stumbled upon the following article:
http://www.josscrowcroft.com/2011/code/utf-8-multibyte-characters-in-url-parameters-%E2%9C%93/
The article talks about using UTF-8 characters in URLs.
I would like to know whether that is safe to do.
I have basically the same setup (browser + OS) as the guy who wrote the article. So I can't really test it.
So... is it safe to use UTF-8 characters in URLs?
And the bonus question: if it's safe, how come not many websites use them?
Unicode characters in the URL (I'm not talking about the domain name) are safe to use. There is no security risk in using them on your own site. (There are some risks to end users who visit a fraudulent site that uses Unicode on the page, as Oded said.)
The only real problem is how older browsers (and OSs) display them. Browsers that don't support them will show those ugly percent-encoded characters in the URL. You should probably also percent-encode the URLs inside your HTML, in case an older browser doesn't encode them for you and the user can't follow the link (which is bad). Modern browsers show the decoded URL in the address bar but send the encoded version in the request, so the user always sees the pretty Unicode characters.
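To illustrate the percent-encoding mentioned above, here is a minimal Python sketch using the standard library's `urllib.parse`. The helper name `encode_url_path` and the `example.com` URLs are made up for illustration; the idea is just to encode the path and query yourself before writing links into your HTML.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def encode_url_path(url: str) -> str:
    """Percent-encode non-ASCII characters in the path and query of a URL.

    Hypothetical helper for illustration. The host part is left untouched,
    and '%' is treated as safe so that already-encoded URLs are not
    encoded a second time.
    """
    parts = urlsplit(url)
    # quote() encodes the text as UTF-8 by default, then escapes each byte.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


# What the server receives for a "pretty" URL typed into the address bar:
encoded = encode_url_path("http://example.com/utf-8-✓/")
print(encoded)  # http://example.com/utf-8-%E2%9C%93/

# What a modern browser would display for that encoded form:
print(unquote(encoded))  # http://example.com/utf-8-✓/
```

The `%E2%9C%93` here is the same sequence that appears at the end of the article's own URL: the three UTF-8 bytes of the check mark character.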
It is possible with any browser that supports IDN.
However, IDN is not well supported across web servers, proxies, and other internet infrastructure, so most sites can't rely on it and still be sure people can reach them...
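For context, IDN applies to the host name only: a non-ASCII domain is converted to an ASCII "xn--" (Punycode) form, whereas the path and query are percent-encoded instead. A rough sketch using Python's built-in `idna` codec follows. Note that this codec implements the older IDNA 2003 rules, and `bücher.example` is just a made-up host.

```python
# Convert an internationalized host name to its ASCII-compatible form.
# Assumption: Python's built-in "idna" codec (IDNA 2003) is good enough
# for illustration; real clients may apply newer rules.
host = "bücher.example"
ascii_host = host.encode("idna").decode("ascii")
print(ascii_host)  # xn--bcher-kva.example

# The conversion is reversible, which is how browsers can show the
# readable form while the ASCII form is what goes over the wire.
round_trip = ascii_host.encode("ascii").decode("idna")
print(round_trip)  # bücher.example
```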
And, as @Rook alludes to, there are still security issues with using UTF-8 this way (XSS for example).
UTF-8 still has a long, long way to go... definitely not safe.
And culturally, I like it that way. I cannot imagine writing or remembering a URL made of Chinese characters, or them doing the same.