I'm looking for a simple way to HTML encode a string/object in Perl. The fewer additional packages used the better.
HTML::Entities is your friend here.
use HTML::Entities;
my $encoded = encode_entities( "foo & bar & <baz>" );
When this question was first answered, HTML::Entities was the module most people probably used. It's pure Perl and by default will escape the HTML reserved characters ><'"& and wide characters.
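If the default set is broader than you want, encode_entities also accepts a second argument naming the characters to encode. Here is a minimal sketch of the difference; the sample string and variable names are made up for illustration:

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Illustrative input: reserved characters plus one high-bit character (e-acute).
my $input = q{Tom & "Jerry" <b>caf} . "\x{e9}" . q{</b>};

# Default set: the reserved characters plus control and high-bit characters.
my $default = encode_entities($input);

# Explicit set: only the characters listed in the second argument are encoded,
# so the e-acute is left alone here.
my $minimal = encode_entities($input, q{<>&"'});

print "$default\n";
print "$minimal\n";
```

Assigning the result matters: called in void context, encode_entities modifies its argument in place.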
Recently, HTML::Escape showed up. It ships both an XS and a pure-Perl implementation. With the XS version it's at least ten times faster than HTML::Entities (closer to twenty times in the run below). However, it only escapes ><'"& and has no way to change the defaults. Here's the difference with the XS version:
Benchmark: timing 10000 iterations of html_entities, html_escape...
html_entities: 14 wallclock secs (14.09 usr + 0.01 sys = 14.10 CPU) @ 709.22/s (n=10000)
html_escape: 1 wallclock secs ( 0.68 usr + 0.00 sys = 0.68 CPU) @ 14705.88/s (n=10000)
And here's the fair fight with pure Perl versions on each side:
Benchmark: timing 10000 iterations of html_entities, html_escape...
html_entities: 14 wallclock secs (13.79 usr + 0.01 sys = 13.80 CPU) @ 724.64/s (n=10000)
html_escape: 7 wallclock secs ( 7.57 usr + 0.01 sys = 7.58 CPU) @ 1319.26/s (n=10000)
You can get these benchmarks in Surveyor::Benchmark::HTMLEntities. I explain how I distribute benchmarks using Surveyor::App.
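If you just want a rough comparison on your own machine, something like the following should do. This is only a sketch using the core Benchmark module, not the code from Surveyor::Benchmark::HTMLEntities; the sample input and iteration count are arbitrary choices, and both modules are assumed to be installed:

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use HTML::Entities qw(encode_entities);
use HTML::Escape qw(escape_html);

# Hypothetical sample input; size and content are arbitrary.
my $sample = q{<a href="/x?a=1&b=2">Tom & Jerry's "page"</a>} x 100;

# Each sub assigns the result so that encode_entities is not called in
# void context, where it would modify $sample in place.
my $results = timethese(1_000, {
    html_entities => sub { my $out = encode_entities($sample) },
    html_escape   => sub { my $out = escape_html($sample) },
});
```

timethese prints a report in the same format as the output above and returns a hash reference of Benchmark objects keyed by label.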
Which do you need to encode, a string or an object? If it's just a string, then you should only have to worry about character-encoding issues such as UTF-8, and CGI's escapeHTML() will probably do the trick for you. If it's an object, you'll need to serialize it first, which opens up a whole new set of issues, but you might want to consider JSON-encoding it.
PS. I originally suggested CGI::escape here, but I can't find any recent documentation on that method (it's actually imported from CGI::Util and is marked as "internal"), and it does URL escaping rather than HTML escaping. You should use escapeHTML() instead, as daxim points out in his comment: http://search.cpan.org/perldoc?CGI#AUTOESCAPING_HTML
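Both cases might look roughly like this. It's a sketch that assumes CGI.pm is installed (it is no longer bundled with recent Perls) and uses the core JSON::PP module for the serialization step; the $record structure is a made-up example:

```perl
use strict;
use warnings;
use CGI qw(escapeHTML);
use JSON::PP ();

# A plain string is escaped directly.
my $escaped_string = escapeHTML(q{<b>Tom & Jerry</b>});

# Hypothetical data structure: serialize it to JSON first, then escape the
# resulting string. canonical() sorts hash keys so the output is stable.
my $record = { name => 'Tom & Jerry', tags => [ '<b>', '<i>' ] };
my $json   = JSON::PP->new->canonical->encode($record);
my $escaped_json = escapeHTML($json);

print "$escaped_string\n";
print "$escaped_json\n";
```

A blessed object would need converting to a plain hash or array first, since JSON::PP refuses to encode blessed references by default.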