According to RFC 2109 and RFC 2965, a cookie's value can be either an HTTP token or a quoted string, and a token can't include non-ASCII characters.
- Cookie RFCs: RFC 2109 and RFC 2965
- HTTP token definition in RFC 2068 and RFC 2616: https://www.rfc-editor.org/rfc/rfc2616#page-16
However, I found that Firefox (3.0.6) sends cookies containing UTF-8 strings as-is, and the three web servers I tested (apache2, lighttpd, nginx) pass the string unchanged to the application.
For example, here is the raw request from the browser:
$ nc -l -p 8080
GET /hello HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.9) Gecko/2009050519 Firefox/2.0.0.13 (Debian-3.0.6-1)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: windows-1255,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: wikipp=1234; wikipp_username=ארתיום
Cache-Control: max-age=0
And here is the raw value of the HTTP_COOKIE CGI variable as passed on by apache, nginx and lighttpd:
wikipp=1234; wikipp_username=ארתיום
What am I missing?
RFC 2109 (Feb 1997) is obsolete and was superseded by RFC 2965 (Oct 2000), according to the Internet Official Protocol Standards (STD 1, RFC 5000).
You may also be interested in the more recent draft (March 7, 2010) that revises RFC 2965.
The only definition of a token in RFC 2965 is:
informally, a sequence of non-special, non-white space characters
I wouldn't consider the entirety of UTF-8 to be disallowed by that definition - only characters that could be mistaken as control/syntax characters.
RFC 2965 has been obsoleted by RFC 6265. According to this RFC:
The cookie name has to be a token, which consists of printable ASCII characters excluding ( ) < > @ , ; : \ " / [ ] ? = { } SPACE TAB
The cookie value consists of printable ASCII characters excluding SPACE " , ; \ and may optionally be surrounded by double quotes
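The two rules above can be sketched as a small Python check. This is only an illustration of the RFC 6265 grammar, not code from any of the servers mentioned. The function names are made up for the example, and percent-encoding the value is an assumption: it is one common way to fit non-ASCII text into the allowed character set, and RFC 6265 does not mandate a particular encoding.

```python
import re
from urllib.parse import quote, unquote

# Cookie name: an HTTP token, i.e. printable ASCII without separators.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

# Cookie value: printable ASCII without SPACE " , ; \
# (RFC 6265 cookie-octet), optionally wrapped in double quotes.
COOKIE_OCTETS = r"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*"
VALUE_RE = re.compile("(?:" + COOKIE_OCTETS + '|"' + COOKIE_OCTETS + '")')


def is_valid_cookie_name(name: str) -> bool:
    """Return True if name is a valid token (hypothetical helper)."""
    return TOKEN_RE.fullmatch(name) is not None


def is_valid_cookie_value(value: str) -> bool:
    """Return True if value matches the RFC 6265 cookie-value grammar."""
    return VALUE_RE.fullmatch(value) is not None


if __name__ == "__main__":
    username = "ארתיום"
    print(is_valid_cookie_name("wikipp_username"))   # name is a plain token
    print(is_valid_cookie_value(username))           # raw UTF-8 is rejected
    encoded = quote(username)                        # assumption: percent-encode
    print(encoded, is_valid_cookie_value(encoded))   # ASCII-only, accepted
    print(unquote(encoded) == username)              # round-trips on the server
```

With this approach the application encodes the value before setting the cookie and decodes it after reading HTTP_COOKIE, so the header itself stays within the grammar even though browsers and servers tolerate raw UTF-8 in practice.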