I am trying to send some very large files (>200MB) through an HTTP output stream from a Java client to a servlet running in Tomcat.
My protocol currently packages the file contents in a byte[], which is placed in a Map<String, Object> along with some metadata (filename, etc.), each part under a "standard" key ("FILENAME" -> "Foo", "CONTENTS" -> byte[], "USERID" -> 1234, etc.). The Map is written to the URL connection output stream (urlConnection.getOutputStream()). This works well when the file contents are small (<25MB), but I am running into Tomcat memory issues (OutOfMemoryError) when the file size is very large.
I thought of sending the metadata Map first, followed by the file contents, and finally by a checksum on the file data. The receiver servlet can then read the metadata from its input stream, then read bytes until the entire file is finished, and finally read the checksum.
Would it be better to send the metadata in connection headers? If so, how? If I send the metadata down the socket first, followed by the file contents, is there some kind of standard protocol for doing this?
You will almost certainly want to use a multipart POST to send the data to the server. Then on the server you can use something like commons-fileupload to process the upload.
The good thing about commons-fileupload is that it understands that the server may not have enough memory to buffer large files and will automatically stream the uploaded data to disk once it exceeds a certain size, which is quite helpful in avoiding OutOfMemoryError-type problems.
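For the client side, here is a minimal sketch of writing a multipart/form-data body without ever holding the whole file in memory. The class name MultipartWriter, the method signature, and the boundary handling are my own assumptions for illustration, not part of any library; it only uses the JDK. Field names and the file name are assumed to be plain ASCII without quotes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

/** Hypothetical helper: writes a multipart/form-data body, streaming the file part. */
public final class MultipartWriter {
    private static final String CRLF = "\r\n";

    private MultipartWriter() {}

    public static void write(OutputStream out, String boundary,
                             Map<String, String> fields,
                             String fileField, Path file) throws IOException {
        // Small metadata values (FILENAME, USERID, ...) go first as ordinary form fields.
        for (Map.Entry<String, String> e : fields.entrySet()) {
            ascii(out, "--" + boundary + CRLF);
            ascii(out, "Content-Disposition: form-data; name=\"" + e.getKey() + "\""
                    + CRLF + CRLF);
            out.write(e.getValue().getBytes(StandardCharsets.UTF_8));
            ascii(out, CRLF);
        }

        // Headers of the file part.
        ascii(out, "--" + boundary + CRLF);
        ascii(out, "Content-Disposition: form-data; name=\"" + fileField
                + "\"; filename=\"" + file.getFileName() + "\"" + CRLF);
        ascii(out, "Content-Type: application/octet-stream" + CRLF + CRLF);

        // Copy the file through a fixed 8 KB buffer instead of loading it into a byte[].
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }

        // Closing boundary marks the end of the body.
        ascii(out, CRLF + "--" + boundary + "--" + CRLF);
        out.flush();
    }

    private static void ascii(OutputStream out, String s) throws IOException {
        out.write(s.getBytes(StandardCharsets.US_ASCII));
    }
}
```

To use it with an HttpURLConnection, you would call setDoOutput(true), set the Content-Type request property to "multipart/form-data; boundary=" plus the same boundary, and call setChunkedStreamingMode(...) before getOutputStream(); otherwise the connection may buffer the whole request body on the client in order to compute the content length.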
Otherwise you are going to have to implement something comparable yourself. It doesn't really make much difference how you package and send your data, so long as the server can 1) parse the upload and 2) redirect the data to a file so that it never has to buffer the entire request in memory at once. As mentioned, both of these come for free if you use commons-fileupload, so that's definitely what I'd recommend.
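If you do roll your own, point 2) boils down to something like the following rough sketch: copy the incoming stream to a file through a small fixed buffer, computing a checksum on the way. The class name StreamToDisk is made up for illustration, and CRC32 is just my assumption for the checksum the question mentions; any digest would do. In a servlet, the input stream would come from request.getInputStream(), positioned after whatever metadata you have already parsed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

/** Hypothetical helper: spools a stream to disk without buffering it all in memory. */
public final class StreamToDisk {

    private StreamToDisk() {}

    /**
     * Copies everything remaining in {@code in} to {@code target} using an 8 KB buffer
     * and returns the CRC32 of the bytes written.
     */
    public static long copy(InputStream in, Path target) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buf = new byte[8192];
        try (OutputStream out = Files.newOutputStream(target)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);   // write this chunk to disk
                crc.update(buf, 0, n);  // fold the same chunk into the checksum
            }
        }
        return crc.getValue();
    }
}
```

Memory use stays at the size of the buffer no matter how large the upload is, and the returned value can be compared with a checksum the client sends after the file data.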
I don't have a direct answer for you, but you might consider using FTP instead. Apache FtpServer (a MINA subproject) provides Ftplets, which are essentially servlets that respond to FTP events (see http://mina.apache.org/ftpserver/ftplet.html for details).
This would allow you to push your data in any format without requiring the receiving end to hold the entire payload in memory.
Regards.