When using urlencode, sometimes a space is encoded as +, and sometimes it is encoded as %20. I am wondering which one is the standard in HTML?
Neither. Query encoding is part of the URI/URN standard, and what the result should look like depends entirely on the server. Some use %-encoding to be on the safe side when parsing (readability doesn't matter), some use +, and some use - (e.g. Stack Overflow).
The reason for the encoding is simple: URIs/URNs don't allow spaces (or other special characters). However, the standard doesn't define how it is supposed to be done.
The URI specification requires any invalid character to be encoded using percent-encoding. Since the space is invalid in URIs, it needs to be encoded as %20.
Besides that, HTML 4 specifies a special encoding for forms, application/x-www-form-urlencoded, which is based on percent-encoding but encodes the space as + instead of %20.
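The difference between the two encodings can be sketched outside PHP as well. The following is a minimal illustration using Python's standard `urllib.parse` module; the parameter names and values are made up for the example.

```python
from urllib.parse import urlencode, quote

# Hypothetical form fields, chosen only for illustration.
params = {"q": "hello world", "lang": "en"}

# Default behaviour follows application/x-www-form-urlencoded:
# the space becomes "+".
form_encoded = urlencode(params)

# Passing quote_via=quote switches to plain percent-encoding:
# the space becomes "%20".
percent_encoded = urlencode(params, quote_via=quote)

print(form_encoded)     # q=hello+world&lang=en
print(percent_encoded)  # q=hello%20world&lang=en
```

Both results are valid query strings; which one a server expects depends on how it decodes them.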
PHP has two different URI encoding functions: rawurlencode, which encodes according to the URI specification (without taking the component context into account), and urlencode, which encodes according to application/x-www-form-urlencoded. urlencode encodes a space as +, while rawurlencode encodes it as %20.
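Other languages tend to offer the same pair of functions. As a rough analogy only (the character sets are not guaranteed to match PHP's in every detail), Python's `quote` behaves like rawurlencode and `quote_plus` like urlencode. The sample string is arbitrary.

```python
from urllib.parse import quote, quote_plus

text = "a b&c"

# quote() percent-encodes the space, similar to PHP's rawurlencode.
# safe="" also encodes "/", which quote() leaves untouched by default.
raw_style = quote(text, safe="")

# quote_plus() uses "+" for the space, similar to PHP's urlencode.
form_style = quote_plus(text)

print(raw_style)   # a%20b%26c
print(form_style)  # a+b%26c
```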
According to the PHP manual for urlencode:
This differs from the RFC 3986 encoding (see rawurlencode()) in that for historical reasons, spaces are encoded as plus (+) signs.
According to the PHP manual for rawurlencode:
Returns a string in which all non-alphanumeric characters except -_.~ have been replaced with a percent (%) sign followed by two hex digits. This is the encoding described in RFC 3986 for protecting literal characters from being interpreted as special URL delimiters, and for protecting URLs from being mangled by transmission media with character conversions (like some email systems).
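The rule quoted above is simple enough to write out by hand. This is a sketch of that rule in Python, not PHP's actual implementation; the function name `percent_encode` is hypothetical, and it assumes the input should be treated as UTF-8 bytes.

```python
# Characters left untouched: ASCII letters, digits, and -_.~
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)

def percent_encode(text: str) -> str:
    """Replace every byte outside UNRESERVED with %XX (assumes UTF-8)."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # A percent sign followed by two uppercase hex digits.
            out.append(f"%{byte:02X}")
    return "".join(out)

print(percent_encode("a b~c/d"))  # a%20b~c%2Fd
```

Note that a space always comes out as %20 here; nothing in this rule produces a +.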
By the standard (RFC 3986), '+' is a reserved character in URIs. URIs have two sub-spaces: URLs and URNs. 'http:' is one URL scheme, and how the RFC 3986 reserved characters are used is specific to http. '?' is another reserved character, which marks the start of the query string in an http URL. Likewise, the reserved character '+' is used to encode a space. Percent-encoding (%20) is the standard way of encoding a space and works in any URI (regardless of implementation).
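Because '+' only means "space" under the form-encoding convention, the decoder has to know which convention was used. A small sketch with Python's `urllib.parse` (the sample strings are invented for illustration):

```python
from urllib.parse import unquote, unquote_plus, quote_plus

encoded = "a+b%20c"

# Plain percent-decoding treats "+" as a literal plus sign.
plain = unquote(encoded)

# Form decoding treats "+" as a space, in addition to decoding %20.
form = unquote_plus(encoded)

# A literal "+" in the data must itself be percent-encoded (%2B)
# so that a form decoder does not turn it into a space.
literal_plus = quote_plus("1+1=2")

print(plain)         # a+b c
print(form)          # a b c
print(literal_plus)  # 1%2B1%3D2
```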
Please also see "When to encode space to plus (+) and when to %20?"
There is no urlencode in HTML, so that's not defined.
What spaces are converted to is a matter of design and implementation.
Spaces are not URI-valid, so they need to be converted. %20 is a URI-valid encoding for spaces. + is a replacement for spaces for better readability.