How to post UTF-8 text from MySQL to Twitter successfully

开发者 https://www.devze.com 2023-02-05 20:54 Source: Internet
I have some text in UTF-8. I put it into a MySQL database (collation utf8_general_ci), and I've been auto-posting it to Twitter via Net::Twitter.

But when I post it, even though Twitter itself seems to expect UTF-8, going by the content-type in their input pages, I'm getting the artefacts you get when UTF-8 text is misinterpreted: é comes out as Ã©, for instance.

So ... at what point is this going wrong? How can I ensure it makes the trip undamaged?

  • Set my script to treat all text as UTF-8 somehow?
  • Make sure I extract it from the database in UTF-8?
  • Tell Net::Twitter that it's posting in UTF-8?


You probably need to enable the mysql_enable_utf8 attribute when opening your db connection:

use DBI;

# mysql_enable_utf8 is a DBD::mysql connection attribute
my $dbh = DBI->connect("DBI:mysql:database=test;host=localhost",
                       "user", "password",
                       { mysql_enable_utf8 => 1 });

This tells DBD::mysql to treat strings retrieved from the database as UTF-8, so Perl receives them as character strings rather than raw bytes.


My guess would be the encoding of the database connection, which is often iso-8859-1 by default. That would explain the Ã©: it's a two-byte UTF-8 character displayed as two single-byte iso-8859-1 characters.

Does sending a query with SET NAMES utf8; after connecting help? (Or whatever specific command Perl's MySQL client library has for setting the connection character set.)
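
To see the misinterpretation this answer describes, here is a minimal sketch using only the core Encode module. It does not touch MySQL or Twitter; the variable names are made up for illustration, and it only shows what happens when the two UTF-8 bytes of é are read as ISO-8859-1.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# "é" (U+00E9) encoded as UTF-8 is the two bytes 0xC3 0xA9.
my $utf8_bytes = encode('UTF-8', "\x{E9}");

# Misreading those bytes as ISO-8859-1 yields two characters:
# U+00C3 ("Ã") and U+00A9 ("©").
my $misread = decode('ISO-8859-1', $utf8_bytes);

# Reading them as UTF-8 recovers the single original character.
my $correct = decode('UTF-8', $utf8_bytes);

# Prints: bytes: 2, misread: 2 chars, correct: 1 char
printf "bytes: %d, misread: %d chars, correct: %d char\n",
    length($utf8_bytes), length($misread), length($correct);
```

If the text looks like this after a round trip, some layer between the database and the output is probably treating UTF-8 bytes as a single-byte encoding.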


I found the answer here.

Instead of

$r = $nt->update({ status => $message });

Try

use Encode;
$r = $nt->update({ status => decode('utf-8', $message) });
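
A plausible reading of why this helps, sketched below without calling Net::Twitter at all: if $message arrives from the database as raw UTF-8 bytes, and the sending layer encodes the status as UTF-8 before transmitting it (an assumption here, implied by the fix rather than confirmed in this thread), the bytes get encoded a second time. Decoding first turns them into a character string, so the later encode step produces the original bytes. The value of $message is a stand-in for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Stand-in for $message as fetched without mysql_enable_utf8:
# the raw UTF-8 bytes for "café" (5 bytes, 4 characters).
my $message = "caf\xC3\xA9";

# Assumed behaviour of the sending layer: encode the status as UTF-8.
# Applied to raw bytes, this encodes each high byte again (double encoding).
my $double_encoded = encode('UTF-8', $message);

# Decoding first turns the bytes into a 4-character string,
# so the later encode step reproduces the original 5 bytes.
my $characters   = decode('UTF-8', $message);
my $sent_correct = encode('UTF-8', $characters);

# Prints: raw: 5 bytes, double-encoded: 7 bytes, decoded then encoded: 5 bytes
printf "raw: %d bytes, double-encoded: %d bytes, decoded then encoded: %d bytes\n",
    length($message), length($double_encoded), length($sent_correct);
```

Note that if mysql_enable_utf8 is already set on the connection, the strings should come back decoded, and calling decode on them again would be a mistake; use one fix or the other.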