I want to ask a question about multipart/form-data. In the HTTP header, I find Content-Type: multipart/form-data; boundary=???. Is the ??? free to be defined by the user? Or is it generated from the HTML? Is it possible for me to define ??? = abcdefg?
Is the ??? free to be defined by the user?
Yes.
Or is it supplied by the HTML?
No. HTML has nothing to do with that. Read below.
Is it possible for me to define the ??? as abcdefg?
Yes.
If you want to send the following data to the web server:
name = John
age = 12
then with application/x-www-form-urlencoded the request body would look like this:
name=John&age=12
As you can see, the server knows that parameters are separated by an ampersand (&). If an & is needed inside a parameter value, it must be percent-encoded (as %26).
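The encoding rule can be checked with Python's standard library. This is only a small sketch: the name and age fields come from the example above, while the company field is a hypothetical value added to show an ampersand inside a value.

```python
from urllib.parse import urlencode, parse_qs

# The two fields from the example above.
body = urlencode({"name": "John", "age": "12"})
print(body)  # name=John&age=12

# Hypothetical value containing "&": it is percent-encoded as %26,
# so it cannot be confused with the separator between parameters.
tricky = urlencode({"company": "Smith & Sons"})
print(tricky)  # company=Smith+%26+Sons

# Decoding restores the original value.
print(parse_qs(tricky))  # {'company': ['Smith & Sons']}
```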
So how does the server know where a parameter value starts and ends when it receives an HTTP request using multipart/form-data?
Using the boundary, which plays a role similar to &.
For example:
--XXX
Content-Disposition: form-data; name="name"

John
--XXX
Content-Disposition: form-data; name="age"

12
--XXX--
In that case, the boundary value is XXX. You specify it in the Content-Type header so that the server knows how to split the data it receives.
So you need to:
Use a value that won't appear in the HTTP data sent to the server.
Be consistent and use the same value everywhere in the request message.
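Those two rules can be sketched in Python as follows. The function name build_multipart is hypothetical, and the sketch assumes plain text fields whose names need no escaping; a real client library would normally do this for you.

```python
import uuid

def build_multipart(fields, boundary=None):
    """Return (content_type, body) for a dict of text fields (illustrative only)."""
    if boundary is None:
        # A random value is very unlikely to collide with the data.
        boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        # Rule 1: the delimiter must not appear in the data being sent.
        if "--" + boundary in value:
            raise ValueError("boundary appears in the data; pick another one")
        lines.append("--" + boundary)
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates part headers from the value
        lines.append(value)
    lines.append("--" + boundary + "--")  # closing delimiter
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    # Rule 2: the same value goes into the Content-Type header.
    content_type = f"multipart/form-data; boundary={boundary}"
    return content_type, body

content_type, body = build_multipart({"name": "John", "age": "12"}, boundary="XXX")
print(content_type)
print(body.decode("utf-8"))
```

Passing boundary=None lets the sketch pick a random boundary, which is what most HTTP clients do in practice.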
The answer to the substance of the question is yes. You can use an arbitrary value for the boundary parameter as long as it is no longer than 70 characters and contains only 7-bit US-ASCII (printable) characters.
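As a rough check, the boundary grammar in RFC 2046 (section 5.1.1) can be written as a regular expression. Note that the grammar is a little stricter than "printable ASCII": it allows letters, digits, a small set of punctuation characters and spaces, and the value must not end with a space. The helper name is_valid_boundary is made up for this sketch.

```python
import re

# RFC 2046: boundary := 0*69<bchars> bcharsnospace
# bchars are letters, digits, '()+_,-./:=? and space; the last char is not a space.
_BOUNDARY_RE = re.compile(
    r"[0-9A-Za-z'()+_,\-./:=? ]{0,69}[0-9A-Za-z'()+_,\-./:=?]"
)

def is_valid_boundary(value: str) -> bool:
    """Return True if value fits the RFC 2046 boundary grammar."""
    return _BOUNDARY_RE.fullmatch(value) is not None

print(is_valid_boundary("abcdefg"))               # True
print(is_valid_boundary("yet another boundary"))  # True
print(is_valid_boundary("x" * 71))                # False: longer than 70 characters
print(is_valid_boundary("trailing space "))       # False: ends with a space
```

A boundary that contains spaces or other special characters, such as the second one, has to be quoted in the Content-Type header.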
If you use one of the multipart/* content types, you are actually required to specify the boundary parameter in the Content-Type header. Otherwise, in the case of an HTTP request, the server will be unable to parse the payload.
Unless you are absolutely certain that only the US-ASCII character set will be used in the payload, you may want to add a Content-Type header to each part, with the charset parameter set to UTF-8.
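A single part carrying non-ASCII text might be built like this. The function text_part, the field name city and its value are all hypothetical; the sketch assumes the boundary and field name are plain ASCII.

```python
def text_part(boundary: str, name: str, value: str) -> bytes:
    """Build one form-data part whose value is declared as UTF-8 text."""
    headers = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{name}"\r\n'
        "Content-Type: text/plain; charset=utf-8\r\n"
        "\r\n"
    )
    # Delimiter and headers stay 7-bit ASCII; only the value is UTF-8 encoded.
    return headers.encode("ascii") + value.encode("utf-8") + b"\r\n"

part = text_part("XXX", "city", "Zürich")
print(part)
```

The closing delimiter (--XXX--) would still have to follow the last part of the body.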
A few relevant excerpts from RFC 2046:
4.1. Text Media Type
A "charset" parameter may be used to indicate the character set of the body text for "text" subtypes, notably including the subtype "text/plain", which is a generic subtype for plain text.
4.1.2. Charset Parameter
A critical parameter that may be specified in the Content-Type field for "text/plain" data is the character set.
Unlike some other parameter values, the values of the charset parameter are NOT case sensitive. The default character set, which must be assumed in the absence of a charset parameter, is US-ASCII.
5.1. Multipart Media Type
As stated in the definition of the Content-Transfer-Encoding field [RFC 2045], no encoding other than "7bit", "8bit", or "binary" is permitted for entities of type "multipart". The "multipart" boundary delimiters and header fields are always represented as 7bit US-ASCII in any case (though the header fields may encode non-US-ASCII header text as per RFC 2047) and data within the body parts can be encoded on a part-by-part basis, with Content-Transfer-Encoding fields for each appropriate body part.
The Content-Type field for multipart entities requires one parameter, "boundary". The boundary delimiter line is then defined as a line consisting entirely of two hyphen characters ("-", decimal value 45) followed by the boundary parameter value from the Content-Type header field, optional linear whitespace, and a terminating CRLF.
Boundary delimiters must not appear within the encapsulated material, and must be no longer than 70 characters, not counting the two leading hyphens.
The boundary delimiter line following the last body part is a distinguished delimiter that indicates that no further body parts will follow. Such a delimiter line is identical to the previous delimiter lines, with the addition of two more hyphens after the boundary parameter value.
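To make the delimiter rules concrete, here is a simplified sketch of how a receiver could split a body into parts. The function split_parts is hypothetical and deliberately naive: it assumes well-formed input with CRLF line endings, and it does not guard against data lines that merely begin with the delimiter. Real servers rely on a proper multipart parser.

```python
def split_parts(body: bytes, boundary: str) -> list:
    """Split a multipart body into raw parts (headers + blank line + content)."""
    delimiter = b"--" + boundary.encode("ascii")
    # Prepend CRLF so the first delimiter line looks like all the others.
    chunks = (b"\r\n" + body).split(b"\r\n" + delimiter)
    parts = []
    for chunk in chunks[1:]:  # chunks[0] is the (usually empty) preamble
        if chunk.startswith(b"--"):
            break  # two extra hyphens: the closing delimiter, nothing follows
        # Skip the rest of the delimiter line (optional whitespace and CRLF).
        _, _, part = chunk.partition(b"\r\n")
        parts.append(part)
    return parts

body = (
    b'--XXX\r\nContent-Disposition: form-data; name="name"\r\n\r\nJohn\r\n'
    b'--XXX\r\nContent-Disposition: form-data; name="age"\r\n\r\n12\r\n'
    b"--XXX--\r\n"
)
parts = split_parts(body, "XXX")
for part in parts:
    # Part headers and content are separated by an empty line.
    headers, _, content = part.partition(b"\r\n\r\n")
    print(headers.decode(), "->", content.decode())
```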
Here is an example using an arbitrary boundary:
Content-Type: multipart/form-data; boundary="yet another boundary"

--yet another boundary
Content-Disposition: form-data; name="foo"

bar
--yet another boundary
Content-Disposition: form-data; name="baz"

quux
--yet another boundary
Content-Disposition: form-data; name="feels"
Content-Type: text/plain; charset=utf-8