Why are .docx files being corrupted when downloading from an ASP.NET page?

Developer https://www.devze.com 2022-12-24 00:40 Source: Internet
I have the following code for serving page attachments to the user:

private void GetFile(string package, string filename)
{
    var stream = new MemoryStream();

    try
    {
        using (ZipFile zip = ZipFile.Read(package))
        {
            zip[filename].Extract(stream);
        }
    }
    catch (System.Exception ex)
    {
        throw new Exception("Resources_FileNotFound", ex);
    }

    Response.ClearContent();
    Response.ClearHeaders();
    Response.ContentType = "application/unknown";

    if (filename.EndsWith(".docx"))
    {
        Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
    }

    Response.AddHeader("Content-Disposition", "attachment;filename=\"" + filename + "\"");
    Response.BinaryWrite(stream.GetBuffer());
    stream.Dispose();
    Response.Flush();
    HttpContext.Current.ApplicationInstance.CompleteRequest();
}

The problem is that all other supported file types work properly (jpg, gif, png, pdf, doc, etc.), but .docx files are corrupted when downloaded and have to be repaired by Office before they can be opened.

At first I didn't know whether the problem was in uncompressing the zip file that contained the .docx, so instead of writing the output only to the response, I saved it to disk first. The saved file opened successfully, so I know the problem must be in writing the response.

Do you know what could be happening?


I also ran into this problem and actually found the answer here:

It turns out that the docx format needs to have Response.End() right after the Response.BinaryWrite.


When storing a binary file in SQL Server, keep in mind that a file is padded to the nearest word boundary, so you can potentially have an extra byte added to a file. The solution is to store the original file size in the database when you store the file, and pass that as the length to the Write method of the Stream object: "Stream.Write(bytes(), 0, length)". This is the ONLY reliable way of getting the correct file size, which is very important for Office 2007 and later files, which do not allow extra characters at the end (most other file types, such as JPGs, don't care).
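
As a rough C# sketch of the idea above (not the answerer's actual code), the helper below writes only the first originalLength bytes of a stored array. The names ExactLengthWriter and WriteOriginalBytes are hypothetical, and it assumes the original size was recorded when the file was saved.

```csharp
using System;
using System.IO;

public static class ExactLengthWriter
{
    // Hypothetical helper: writes only the first originalLength bytes of
    // 'stored' to 'output', dropping any trailing padding bytes.
    // Assumes originalLength is the size recorded when the file was saved.
    public static void WriteOriginalBytes(byte[] stored, int originalLength, Stream output)
    {
        if (stored == null) throw new ArgumentNullException("stored");
        if (output == null) throw new ArgumentNullException("output");
        if (originalLength < 0 || originalLength > stored.Length)
            throw new ArgumentOutOfRangeException("originalLength");

        output.Write(stored, 0, originalLength);
    }

    public static void Main()
    {
        byte[] stored = { 1, 2, 3, 0 }; // the last byte stands in for padding
        using (var output = new MemoryStream())
        {
            WriteOriginalBytes(stored, 3, output);
            Console.WriteLine(output.Length); // 3
        }
    }
}
```

In an ASP.NET page the third argument would presumably be Response.OutputStream.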


You should not use stream.GetBuffer() because it returns the buffer array which might contain unused bytes. Use stream.ToArray() instead. Also, have you tried calling stream.Seek(0, SeekOrigin.Begin) before writing anything?

Best Regards,
Oliver Hanappi
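
To illustrate the GetBuffer() point, here is a small standalone sketch (the class and method names are made up for the example). It writes a short payload to a MemoryStream and compares the two calls: ToArray() returns exactly the bytes written, while GetBuffer() returns the whole internal buffer, which is usually larger.

```csharp
using System;
using System.IO;

public static class BufferDemo
{
    // Hypothetical helper: returns how many bytes GetBuffer() exposes
    // beyond the bytes that were actually written to the stream.
    public static int UnusedBytes(int payloadSize)
    {
        using (var stream = new MemoryStream())
        {
            stream.Write(new byte[payloadSize], 0, payloadSize);

            int exact = stream.ToArray().Length;    // always equals payloadSize
            int raw = stream.GetBuffer().Length;    // current capacity, >= payloadSize
            return raw - exact;
        }
    }

    public static void Main()
    {
        // A small payload makes the gap between the two easy to see.
        Console.WriteLine("Unused bytes in buffer: " + UnusedBytes(10));
    }
}
```

Any unused bytes sent with BinaryWrite(stream.GetBuffer()) end up appended to the downloaded file, which would explain why a strict format like .docx complains while more tolerant formats do not.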


For what it's worth, I also ran into the same problem listed here. For me the issue was actually with the upload code not the download code:

    Public Sub ImportStream(FileStream As Stream)
        'Use this method with FileUpload.PostedFile.InputStream as a parameter, for example.
        Dim arrBuffer(FileStream.Length) As Byte
        FileStream.Seek(0, SeekOrigin.Begin)
        FileStream.Read(arrBuffer, 0, FileStream.Length)
        Me.FileImage = arrBuffer
    End Sub

In this example the problem is that I declare the Byte array arrBuffer one byte too large: in VB.NET, Dim arr(n) specifies the upper bound, so the array has n + 1 elements. The extra null byte is then saved with the file image to the DB and reproduced on download. The corrected code would be:

        Dim arrBuffer(FileStream.Length - 1) As Byte

Also for reference my HttpResponse code is as follows:

                context.Response.Clear()
                context.Response.ClearHeaders()
                'SetContentType() is a function which looks up the correct mime type
                'and also adds an informational header about the lookup process...
                context.Response.ContentType = SetContentType(objPostedFile.FileName, context.Response)
                context.Response.AddHeader("content-disposition", "attachment;filename=" & HttpUtility.UrlPathEncode(objPostedFile.FileName))
                'For reference: Public Property FileImage As Byte()
                context.Response.BinaryWrite(objPostedFile.FileImage)
                context.Response.Flush()
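
For comparison, a C# sketch of the same upload step (UploadReader and ReadAllBytes are hypothetical names, not part of the answer above). Copying through a MemoryStream and calling ToArray() avoids sizing the array by hand, so there is no room for an off-by-one byte.

```csharp
using System;
using System.IO;

public static class UploadReader
{
    // Hypothetical helper: reads the whole input stream into an array
    // that is exactly as long as the data, with no trailing byte.
    public static byte[] ReadAllBytes(Stream input)
    {
        if (input == null) throw new ArgumentNullException("input");
        if (input.CanSeek) input.Seek(0, SeekOrigin.Begin);

        using (var copy = new MemoryStream())
        {
            input.CopyTo(copy);     // Stream.CopyTo is available from .NET 4.0
            return copy.ToArray();  // exact length, unlike GetBuffer()
        }
    }

    public static void Main()
    {
        using (var upload = new MemoryStream(new byte[] { 10, 20, 30 }))
        {
            Console.WriteLine(ReadAllBytes(upload).Length); // 3
        }
    }
}
```

With a FileUpload control, the input would presumably be FileUpload.PostedFile.InputStream, as in the VB example.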


If you use the approach that calls Response.Close(), download managers such as the one in IE10 will report 'cannot download file' because the byte count does not match the headers. See the documentation. Do NOT use Response.Close. EVER.

However, calling CompleteRequest alone does not stop bytes from being written to the output stream, so XML-based applications such as Word 2007 will see the .docx as corrupted.

In this case, break the rule to NEVER use Response.End. The following code solves both problems. Your results may vary:

'*** transfer package file memory buffer to output stream
Response.ClearContent()
Response.ClearHeaders()
Response.AddHeader("content-disposition", "attachment; filename=" + NewDocFileName)
Me.Response.ContentType = "application/vnd.ms-word.document.12"
Response.ContentEncoding = System.Text.Encoding.UTF8
strDocument.Position = 0
strDocument.WriteTo(Response.OutputStream)
strDocument.Close()
Response.Flush()
'See documentation at http://blogs.msdn.com/b/aspnetue/archive/2010/05/25/response-end-response-close-and-how-customer-feedback-helps-us-improve-msdn-documentation.aspx
HttpContext.Current.ApplicationInstance.CompleteRequest() 'This is the preferred method
'Response.Close() 'BAD pattern. Do not use this approach; it causes 'cannot download file' in IE10 and other download managers that compare the Content-Length header to the actual byte count
Response.End() 'BAD pattern as well. However, CompleteRequest does not terminate sending bytes, so Word or other XML-based apps will see the file as corrupted. So use this to solve it.


It all looks ok. My only idea is to try calling Dispose on your stream after calling Response.Flush instead of before, just in case the bytes aren't entirely written before flushing.


Take a look at this: Writing MemoryStream to Response Object

I had the same problem and the only solution that worked for me was:

    Response.Clear();
    Response.ContentType = "Application/msword";
    Response.AddHeader("Content-Disposition", "attachment; filename=myfile.docx");
    Response.BinaryWrite(myMemoryStream.ToArray());
    // myMemoryStream.WriteTo(Response.OutputStream); //works too
    Response.Flush();
    Response.Close();
    Response.End();


I had the same problem when trying to open .docx and .xlsx documents. I solved it by setting the cacheability to ServerAndPrivate instead of NoCache.

Here is my method for serving the document:

public void ProcessRequest(HttpContext context)
{
    var fi = new FileInfo(context.Request.Path);
    var mediaId = ResolveMediaIdFromName(fi.Name);
    if (mediaId == null) return;

    int mediaContentId;
    if (!int.TryParse(mediaId, out mediaContentId)) return;

    var media = _repository.GetPublicationMediaById(mediaContentId);
    if (media == null) return;

    var fileNameFull = string.Format("{0}{1}", media.Name, media.Extension);
    context.Response.Clear();
    context.Response.AddHeader("content-disposition", string.Format("attachment;filename={0}", fileNameFull));
    context.Response.Charset = "";
    context.Response.Cache.SetCacheability(HttpCacheability.ServerAndPrivate);
    context.Response.ContentType = media.ContentType;
    context.Response.BinaryWrite(media.Content);
    context.Response.Flush();
    context.Response.End();
}


Geoff Tanaka's answer also works for Response.WriteFile, not just BinaryWrite: adding Response.End() after it gets rid of the Office document corruption error "Word found unreadable content". It turns out all the messing about with Response.ContentType was unnecessary, and I can now revert to "application/octet-stream". Another afternoon I'll never get back.
