Can I send a POST form in an encoding other than that of its body?

Developer https://www.devze.com 2022-12-29 21:05 Source: Internet
I have an HTML page that looks like:

<HTML>
<meta http-equiv='Content-Type' content='text/html; charset=gb2312'>
<BODY onload='document.forms[0].submit();'>
<form name="form" method="post" action="/path/to/some/servlet">
<input type="hidden" name="username" value="麗安"> <!-- UTF-8 characters -->
</form>
</BODY>
</HTML>

As you can see, the content of this page is UTF-8, but I need to send it with GB2312 character encoding, because the servlet I am sending this page to expects GB2312.

Is this a valid scenario? In the servlet, I couldn't get these Chinese characters back, even with a filter that sets the character encoding to GB2312!

I've created a sample Servlet:

package org.daz;

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class EncodingServlet extends HttpServlet {
    private static final long serialVersionUID = 1L;
    private static final String ENCODING = "GB2312";

    protected void doPost(HttpServletRequest request, HttpServletResponse response) 
        throws ServletException, IOException {

        setCharacterEncoding(request, response);

        String username = request.getParameter("username");
        System.out.println(username);

    }

    private void setCharacterEncoding(HttpServletRequest request, HttpServletResponse response) throws IOException {
        request.setCharacterEncoding(ENCODING);
        response.setCharacterEncoding(ENCODING);
    }

}

The output is: 楹��


This is not possible. You'll need to use GB2312 characters from the start, or change the entire application to use UTF-8 only. You can't convert from character encoding X to character encoding Y that way: any character outside the ASCII range is likely to get corrupted.
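
To illustrate the corruption, here is a minimal sketch in plain Java. It assumes the runtime provides the GB2312 charset (standard full JDKs do); the class and method names are made up for this example. It writes a string as bytes in one charset and reads those bytes back in another, which is roughly what happens when UTF-8 bytes are handed to something that expects GB2312.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the question's servlet.
class CharsetMismatchDemo {
    // Assumption: the JRE ships the GB2312 charset.
    static final Charset GB2312 = Charset.forName("GB2312");

    /** Encodes text with one charset and decodes the same bytes with another. */
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is a two-character Chinese string that GB2312 can represent.
        String original = "\u4e2d\u6587";

        // Same charset on both sides: the text survives.
        System.out.println(reinterpret(original, GB2312, GB2312));

        // UTF-8 bytes read as GB2312: the text comes back garbled.
        System.out.println(reinterpret(original, StandardCharsets.UTF_8, GB2312));

        // ASCII is encoded identically in both charsets, so it survives either way.
        System.out.println(reinterpret("username", StandardCharsets.UTF_8, GB2312));
    }
}
```

Only the ASCII string comes through unchanged in the mismatched case, which matches the point above about characters outside the ASCII range.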

The form's accept-charset attribute that some suggest is ignored by most web browsers. The W3C spec also literally states "User agents may interpret .. ", not "must". And even then, it would only be used to encode the actual user input, not hidden fields as in your example. Those have already been interpreted in the page's own encoding (in this case GB2312). In other words, those UTF-8 characters are already corrupted by the time the browser has processed the page.
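
One way to "use GB2312 from the beginning" is to produce the page bytes in GB2312 in the first place, so the hidden field matches the declared charset. The following is only a sketch under assumptions: the class and method names are hypothetical, the JRE is assumed to provide GB2312, and the value is inserted without HTML escaping to keep the example short. It also checks up front whether the value can be represented in GB2312 at all, since characters outside that set cannot be sent this way.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not taken from the question's application.
class Gb2312PageSketch {
    // Assumption: the JRE ships the GB2312 charset.
    static final Charset GB2312 = Charset.forName("GB2312");

    /** Returns true if every character of the value has a GB2312 representation. */
    static boolean representable(String value) {
        return GB2312.newEncoder().canEncode(value);
    }

    /** Builds the auto-submitting page and returns it as GB2312 bytes. */
    static byte[] renderPage(String username) {
        if (!representable(username)) {
            throw new IllegalArgumentException("value cannot be written in GB2312");
        }
        // Note: a real page should HTML-escape the value; omitted here for brevity.
        String html = "<html><head>"
            + "<meta http-equiv='Content-Type' content='text/html; charset=gb2312'>"
            + "</head><body onload='document.forms[0].submit();'>"
            + "<form name=\"form\" method=\"post\" action=\"/path/to/some/servlet\">"
            + "<input type=\"hidden\" name=\"username\" value=\"" + username + "\">"
            + "</form></body></html>";
        // The bytes now agree with the charset declared in the meta tag.
        return html.getBytes(GB2312);
    }

    public static void main(String[] args) {
        byte[] page = renderPage("\u4e2d\u6587");
        System.out.println(page.length + " bytes, decodes back to:");
        System.out.println(new String(page, GB2312));
    }
}
```

If `representable` returns false for the real data, that is a sign the application needs UTF-8 end to end rather than GB2312.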


You can try to do this,

<form name="form" method="post" action="/path/to/some/servlet" charset="gb2312" accept-encoding="gb2312">
<input type="hidden" name="username" value="麗安"> <!-- UTF-8 characters -->
</form>

It might work in some browsers. However, browsers are not required to support GB2312, so your mileage may vary.
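
Whether this works can be checked on the receiving side. A form posted as application/x-www-form-urlencoded carries non-ASCII values as percent-encoded bytes, and the charset used to decode them has to match the one the browser used to encode them. The sketch below uses only `java.net.URLEncoder` and `java.net.URLDecoder` to mimic that step; the class and method names are made up, and it is not a description of how any particular servlet container decodes the body.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class that mimics form-value encoding and decoding.
class FormBodyDecodingSketch {
    /** Percent-encodes a form value using the named charset. */
    static String encodeValue(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("charset not available: " + charset, e);
        }
    }

    /** Percent-decodes a form value using the named charset. */
    static String decodeValue(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("charset not available: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String username = "\u4e2d\u6587";

        // What a browser that submits in GB2312 would put on the wire.
        String sentAsGb2312 = encodeValue(username, "GB2312");
        // What a browser that submits in UTF-8 would put on the wire.
        String sentAsUtf8 = encodeValue(username, "UTF-8");
        System.out.println(sentAsGb2312);
        System.out.println(sentAsUtf8);

        // Decoding as GB2312 only recovers the value in the first case.
        System.out.println(decodeValue(sentAsGb2312, "GB2312"));
        System.out.println(decodeValue(sentAsUtf8, "GB2312"));
    }
}
```

Logging the raw request body in the servlet and comparing it with these two encoded forms is one way to see which charset a given browser actually used.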


<form accept-charset="gb2312">

http://www.w3.org/TR/REC-html40/interact/forms.html#adef-accept-charset
