I don't know why this code doesn't work:
use strict;
use warnings;
use Encode qw/decode/;
my $entity_unicode = "00A0";
$entity_unicode = decode("UTF-16", pack('H4', $entity_unicode));
print $entity_unicode, "\n";
It prints out: "UTF-16:Unrecognised BOM a0 at /usr/lib/perl/5.10/Encode.pm line 174.".
Without a BOM (U+FEFF) at the start of the string to decode, there is no way to know whether the bytes 00 A0 mean U+00A0 (UTF-16BE) or U+A000 (UTF-16LE, the byte order used by Windows). You must name the exact encoding when the BOM is absent. In this case, that's UTF-16BE.
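A minimal sketch of the fix, reusing the question's `$entity_unicode`. The other variable names (`$be`, `$le`, `$with_bom`) are only for illustration. It decodes the same two bytes with each explicit byte order, and also shows that plain "UTF-16" works once a BOM is prepended:

```perl
use strict;
use warnings;
use Encode qw/decode/;

my $entity_unicode = "00A0";
my $bytes = pack('H4', $entity_unicode);   # the two bytes 0x00 0xA0, no BOM

# Name the byte order explicitly, since there is no BOM to detect it from.
my $be = decode("UTF-16BE", $bytes);
my $le = decode("UTF-16LE", $bytes);

# Alternatively, prepend a big-endian BOM (FE FF) and keep plain "UTF-16".
my $with_bom = decode("UTF-16", "\xFE\xFF" . $bytes);

printf "UTF-16BE:     U+%04X\n", ord($be);        # U+00A0
printf "UTF-16LE:     U+%04X\n", ord($le);        # U+A000
printf "UTF-16 + BOM: U+%04X\n", ord($with_bom);  # U+00A0

# Encode on output so the character is written as UTF-8 bytes.
binmode(STDOUT, ':encoding(UTF-8)');
print $be, "\n";
```

If the input is always a hex code point like this, `chr(hex($entity_unicode))` should give the same character without going through Encode at all.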