
PHP: What is the character encoding of this string?

Posted on https://www.devze.com, 2023-02-23 02:18. Source: the web

In PHP, I have the following string: =CA=CC=D1=C8=C9

What is its character encoding?


It does not make sense to have a string without knowing what encoding it uses.

Those 5 bytes mean different things in different encodings.

  • In UTF-8, it's invalid. All lead bytes and no trail bytes.
  • In ISO-8859-1 and windows-1252, it's the string ÊÌÑÈÉ.
  • According to chardet, it's in KOI8-R, and decodes to йляхи
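
To make the list above concrete, here is a minimal PHP sketch that interprets the same five bytes under each of the encodings mentioned. It assumes the iconv extension is available and that the underlying iconv library knows these encoding names; the variable names are only illustrative.

```php
<?php
// The five raw bytes that "=CA=CC=D1=C8=C9" stands for.
$bytes = "\xCA\xCC\xD1\xC8\xC9";

// preg_match() with the /u modifier fails on a subject that is not valid
// UTF-8, so it doubles as a cheap validity check.
$isValidUtf8 = preg_match('//u', $bytes) === 1;
var_dump($isValidUtf8); // bool(false)

// Interpret the same bytes under three single-byte encodings and print
// each result as UTF-8.
$decoded = [];
foreach (['ISO-8859-1', 'WINDOWS-1252', 'KOI8-R'] as $encoding) {
    $decoded[$encoding] = iconv($encoding, 'UTF-8', $bytes);
    echo $encoding, ': ', $decoded[$encoding], PHP_EOL;
}
// ISO-8859-1: ÊÌÑÈÉ
// WINDOWS-1252: ÊÌÑÈÉ
// KOI8-R: йляхи
```

Every one of these conversions succeeds, which is exactly the problem: the bytes alone do not say which reading is the intended one.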


The answer and comments that you got assumed that you already knew the transfer encoding was "quoted-printable". Decoded that way, "=CA=CC=D1=C8=C9" becomes "\xCA\xCC\xD1\xC8\xC9" (which is NOT UTF-8, as you asked for in a comment). They then concentrated on which character encoding might reasonably be used to turn those bytes into Unicode. To arrive at UTF-8, you need two more steps: decode "\xCA\xCC\xD1\xC8\xC9" into Unicode (using an encoding appropriate to Arabic text) and then encode the result as UTF-8.
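
The whole chain might look like the sketch below. The choice of Windows-1256 as the Arabic encoding is an assumption made for illustration (the answer above does not name one), and the snippet relies on the iconv extension; in PHP, decoding to Unicode and encoding to UTF-8 happen in a single iconv() call.

```php
<?php
// Step 1: undo the quoted-printable transfer encoding to get the raw bytes.
$raw = quoted_printable_decode('=CA=CC=D1=C8=C9');

// Steps 2 and 3: read the bytes as Windows-1256 and re-encode as UTF-8.
// Windows-1256 is an assumed source charset; confirm it for your own data.
$sourceCharset = 'WINDOWS-1256';
$utf8 = iconv($sourceCharset, 'UTF-8', $raw);

if ($utf8 === false) {
    throw new RuntimeException("Could not convert from $sourceCharset");
}

echo $utf8, PHP_EOL;          // تجربة
echo bin2hex($utf8), PHP_EOL; // d8aad8acd8b1d8a8d8a9
```

Under that assumption the five bytes turn into a five-letter Arabic word, stored as ten bytes of UTF-8. If the data actually came from a different charset, only $sourceCharset needs to change.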


It is called quoted-printable.

I can decode it using:

quoted_printable_decode($string);