What are the repercussions for using latin1 instead of utf8 for MySQL tables created for OAuth usage?

I am in the midst of getting OAuth support set up on a shared server. The server side PHP OAuth library I am trying to install is this one:

http://code.google.com/p/oauth-php/downloads/list

And I am following the installation notes found here:

http://code.google.com/p/oauth-php/wiki/ConsumerHowTo

In the notes there was a tip to use an SQL script found in the install package to set up the tables and databases for you. When I tried executing the script via the Import (SQL) feature in phpMyAdmin, I got a "Key Too Long" error on one of the tables. In other words, I ran smack into the maximum key length limitation of MySQL/InnoDB tables.

To get around this problem I replaced all instances of "charset=utf8" with "charset=latin1", since MySQL reserves up to 3 bytes per character for utf8 columns when sizing an index key, while latin1 needs only 1 byte per character. The script then executed fine and all tables were created correctly.
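To make the arithmetic concrete, here is a rough Python sketch of how the character set changes the worst-case size of an index key. The 300-character column and the 767-byte limit are assumptions chosen for illustration, not values taken from the oauth-php script; the real limit depends on the storage engine, the MySQL version and the row format, and the sketch ignores any per-column length overhead.

```python
# Bytes MySQL reserves per character when sizing an index key.
BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}


def index_key_bytes(column_lengths, charset):
    """Worst-case byte size of an index over character columns.

    column_lengths: declared lengths in characters, e.g. [255] for VARCHAR(255).
    """
    return sum(column_lengths) * BYTES_PER_CHAR[charset]


def fits(column_lengths, charset, limit_bytes):
    """True if the index key stays within the given byte limit."""
    return index_key_bytes(column_lengths, charset) <= limit_bytes


def max_indexable_chars(charset, limit_bytes):
    """Largest character count whose worst case still fits the limit."""
    return limit_bytes // BYTES_PER_CHAR[charset]


# Assumed values for illustration only (not taken from the oauth-php schema).
ASSUMED_LIMIT = 767      # hypothetical key limit in bytes
hypothetical_key = [300]  # a single VARCHAR(300) column

for charset in ("utf8", "latin1"):
    size = index_key_bytes(hypothetical_key, charset)
    ok = fits(hypothetical_key, charset, ASSUMED_LIMIT)
    print(f"{charset}: {size} bytes, fits={ok}")
```

Under these assumptions the same column needs 900 bytes as utf8 but only 300 as latin1, which is why switching the charset makes the error go away without changing any column length.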

As far as I can see, none of the fields used in the tables require support for multi-byte international characters. The only way I could see a problem developing is if one of the OAuth connected services I access uses international characters in its consumer key or secret, and I have not run into that situation at all so far.

Can anyone tell me if this workaround will bite me in the backside at any time, and where that might be? Also, if anyone has a better solution for fixing the "key too long" problem without sacrificing the use of the utf8 character set, I'd like to know about it.

Technically, all strings must first be utf8-encoded before urlencoding. See the OAuth 1.0 spec, section 5.1:

"All parameter names and values are escaped using the [RFC3986] percent-encoding (%xx) mechanism. Characters not in the unreserved character set ([RFC3986] section 2.3) MUST be encoded. Characters in the unreserved character set MUST NOT be encoded. Hexadecimal characters in encodings MUST be upper case. Text names and values MUST be encoded as UTF-8 octets before percent-encoding them per [RFC3629]."

So if you have any Latin-1 characters that aren't also ASCII (that is, any byte in the range 0x80-0xFF, where bit 7 is set), then you'll have to re-encode the strings as UTF-8 after pulling them from the db and before using them in the OAuth protocol.
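As an illustration of that re-encoding step, here is a minimal Python sketch (the library in question is PHP, so this only shows the idea). The function names are hypothetical, and it assumes the database driver hands back the raw Latin-1 bytes; whether it does depends on the connection character set.

```python
from urllib.parse import quote


def oauth_percent_encode(text):
    """Percent-encode a str in the style of OAuth 1.0 section 5.1.

    quote() encodes str input as UTF-8 and emits upper-case hex.
    Letters, digits, '-', '.' and '_' are never escaped; safe="~"
    keeps '~' unescaped as well, and everything else is escaped.
    """
    return quote(text, safe="~")


def latin1_column_to_oauth(raw_bytes):
    """Decode Latin-1 bytes from the db, then encode for OAuth."""
    text = raw_bytes.decode("latin-1")  # every byte maps to one character
    return oauth_percent_encode(text)   # UTF-8 octets, then %XX


# Hypothetical stored value: "café" as Latin-1 (0xE9 is e-acute).
raw = b"caf\xe9"

print(latin1_column_to_oauth(raw))  # caf%C3%A9 (UTF-8 octets, correct)
print(quote(raw, safe="~"))         # caf%E9 (raw Latin-1 byte, wrong for OAuth)
```

Pure ASCII values come out identical either way, which is why the latin1 tables cause no trouble as long as keys and secrets stay within ASCII.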