utf-8 problem with rawurldecode and browser address bar issue

devze.com (https://www.devze.com), 2023-03-31 12:54, source: the web
I have a problem with rawurldecode and the Turkish character set.

I have a Turkish word (yeşil, which means green) that needs to be passed as a GET parameter.

Here is my generated link:

search.php?renk=ye%C5%9Fil

When I click this link, the browser address bar shows it like this (it is decoded properly):

search.php?renk=yeşil

The problem starts here. When I modify the URL in the browser address bar (for example, by adding an extra GET parameter) and hit Enter, the browser changes the keyword and generates a URL like the one below:

search.php?renk=ye%FEil

After this point the server-side code can't handle the parameter and produces wrong results. Is there a standard way of avoiding this?

Thanks.


It looks like your browser converts the link to ISO-8859-9 encoding, or something similar. %FE is the URL-encoded form of ş in ISO-8859-9.

I've tried iconv("iso8859-9", "utf-8", rawurldecode("search.php?renk=ye%FEil")) and it worked.
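If you need to accept both forms of the link, one option is to check whether the decoded value is valid UTF-8 and convert it only when it is not. This is just a sketch: `decode_param` is a made-up helper name, it needs the mbstring and iconv extensions, and it assumes that anything which is not valid UTF-8 is ISO-8859-9, which is a guess based on the %FE you are seeing.

```php
<?php
// Hypothetical helper: decode a raw (still percent-encoded) query value
// and normalize it to UTF-8.
// Assumption: any value that is not valid UTF-8 is ISO-8859-9.
function decode_param(string $raw): string
{
    $value = rawurldecode($raw);

    // Valid UTF-8 (the ye%C5%9Fil case) passes through unchanged.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }

    // Otherwise treat the bytes as ISO-8859-9 (the ye%FEil case).
    return iconv('ISO-8859-9', 'UTF-8', $value);
}

// This file is assumed to be saved as UTF-8.
var_dump(decode_param('ye%C5%9Fil') === 'yeşil'); // bool(true)
var_dump(decode_param('ye%FEil') === 'yeşil');    // bool(true)
```

Values read from $_GET are already percent-decoded by PHP, so for those you would skip the rawurldecode() call and only keep the encoding check.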


URLs always use US-ASCII!

See RFC: http://www.ietf.org/rfc/rfc1738.txt

No corresponding graphic US-ASCII:

URLs are written only with the graphic printable characters of the US-ASCII coded character set. The octets 80-FF hexadecimal are not used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent control characters; these must be encoded.

So you run into a lot of problems. If you paste a URL into the browser, the address field sometimes relies on the OS locale, and the browser may convert the URL. Sometimes firewalls and proxies filter URLs as well!

The next important question is how the web server interprets those high characters and how it passes them on to PHP (this depends on the gateway). PHP decodes URLs automatically, so what happens to your high characters there? PHP doesn't care about the encoding.
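To see what PHP actually ends up with, you can dump the raw bytes of the decoded value. This small sketch uses parse_str(), which parses a string as if it were the query string, on the two URLs from the question; the variable names are only for illustration.

```php
<?php
// parse_str() parses the string as if it were the query string of a request.
parse_str('renk=ye%C5%9Fil', $utf8);
parse_str('renk=ye%FEil', $legacy);

// Same word, two different byte sequences. PHP just hands over the bytes.
echo bin2hex($utf8['renk']), "\n";   // 7965c59f696c
echo bin2hex($legacy['renk']), "\n"; // 7965fe696c
```

The percent-decoding itself works in both cases. What differs is the byte sequence your script receives, and PHP does not tell you which encoding it is in.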

In my opinion there is only one solution to be safe: encode your Unicode string as a base64-encoded string. This will be safe within the URL because it is ASCII.

Within your script you can decode it, and you have the string back in the encoding you set before.
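A rough sketch of that round trip follows. The helper names `encode_param` and `decode_param_b64` are made up for this example. Note that plain base64 output can contain '+', '/' and '=', which have their own meaning in URLs, so the sketch swaps them for the URL-safe characters '-' and '_' and drops the padding. That substitution is my addition, not something the approach requires if you pass the result through rawurlencode() instead.

```php
<?php
// Hypothetical helpers for passing arbitrary bytes through a URL as ASCII.
function encode_param(string $text): string
{
    // Replace '+' and '/' with URL-safe characters and strip '=' padding.
    return rtrim(strtr(base64_encode($text), '+/', '-_'), '=');
}

function decode_param_b64(string $token): string
{
    // Undo the character swap and restore the padding before decoding.
    $b64 = strtr($token, '-_', '+/');
    $b64 .= str_repeat('=', (4 - strlen($b64) % 4) % 4);
    return base64_decode($b64);
}

// This file is assumed to be saved as UTF-8.
$token = encode_param('yeşil');
echo 'search.php?renk=' . $token, "\n";           // search.php?renk=eWXFn2ls
var_dump(decode_param_b64($token) === 'yeşil');   // bool(true)
```

The link is no longer human-readable, but editing the URL in the address bar can no longer change how the keyword is encoded.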
