I have a problem that I have not been able to solve in months. I store Word resumes in a varbinary(max) column, and I can retrieve them with a full-text search without any trouble. The resumes are returned as Word documents by an .ashx handler, using the code below.
I really need to implement hit highlighting on the site so that users can see whether a returned resume is a good fit. I don't think this can be done from an .ashx file, so I think I need either to open the resume as HTML in an .aspx page and use JavaScript to do the highlighting, or to extract the text-only content of the Word document and wrap the hits in HTML tags before display. I can't find anything that addresses this problem, and I am hoping someone can point me in the right direction.
Thanks in advance for any advice.
Imports System.IO
Imports System.Web
Imports System.Data
Imports System.Data.SqlClient

Public Class ReadResume : Implements IHttpHandler
    Const conString As String = "Data Source=tcp:sql2k804.discountasp.net;Initial Catalog=SQL2008R2_284060_resumedata;User ID=SQL2008R2_284060_resumedata_user;Password=mypwd2314;"

    Public Sub ProcessRequest(ByVal context As HttpContext) Implements IHttpHandler.ProcessRequest
        Dim con As SqlConnection = New SqlConnection(conString)
        Dim cmd As SqlCommand = New SqlCommand("Select ResumeDoc, DocTypeExtension From ResumeTable WHERE CandidateId=@CandidateId", con)
        Dim CId As String = System.Web.HttpContext.Current.Request.QueryString("Para")
        cmd.Parameters.AddWithValue("@CandidateId", CId)
        Using con
            con.Open()
            Dim myReader As SqlDataReader = cmd.ExecuteReader
            If myReader.Read() Then
                context.Response.Clear()
                context.Response.ClearContent()
                context.Response.ClearHeaders()
                Dim file() As Byte = CType(myReader("ResumeDoc"), Byte())
                Dim doc_type As String = CType(myReader("DocTypeExtension"), String)
                context.Response.ContentEncoding = System.Text.Encoding.UTF8
                context.Response.ContentType = "Application/msword"
                context.Response.AddHeader("content-disposition", "Candidate Resume")
                context.Response.BinaryWrite(file)
            End If
        End Using
    End Sub

    Public ReadOnly Property IsReusable() As Boolean Implements IHttpHandler.IsReusable
        Get
            Return False
        End Get
    End Property
End Class
You can use the Microsoft Office COM components (Word automation) to work with Word documents; note that this requires Word to be installed on the server. For example, this article shows one way to convert Word to HTML: http://rongchaua.net/blog/c-convert-word-to-html/
UPDATE: There are other solutions.
If you have only .docx (not .doc) documents, you can use this simple code to extract plain text from them: http://www.codeproject.com/KB/office/ExtractTextFromDOCXs.aspx The same code is also published here: http://conceptdev.blogspot.com/2007/03/open-docx-using-c-to-extract-text-for.html
There are some commercial libraries for reading/writing Word documents: http://www.aspose.com/categories/.net-components/aspose.words-for-.net/default.aspx http://www.cellbi.com/Products.aspx
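Whichever of these routes produces the plain text of the resume, the hit highlighting itself could then be done in JavaScript, as the question suggests. The following is only a minimal sketch under that assumption: the function names (escapeHtml, escapeRegExp, highlightHits) and the "hit" CSS class are made up for illustration and are not part of any of the libraries linked above. It HTML-escapes the text and wraps each case-insensitive occurrence of a search term in a span.

```javascript
// Escape characters that are special in HTML so the resume text is shown literally.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return HTML in which every case-insensitive occurrence of any search term
// is wrapped in <span class="hit">. "hit" is a hypothetical CSS class name.
function highlightHits(text, terms) {
  const cleaned = terms
    .map(t => (t || "").trim())
    .filter(t => t.length > 0)
    .sort((a, b) => b.length - a.length) // try longer terms first
    .map(escapeRegExp);
  if (cleaned.length === 0) {
    return escapeHtml(text);
  }
  const pattern = new RegExp("(" + cleaned.join("|") + ")", "gi");
  // Splitting on a pattern with one capturing group keeps the matches:
  // even indexes are ordinary text, odd indexes are the matched terms.
  return text
    .split(pattern)
    .map((part, i) =>
      i % 2 === 1
        ? '<span class="hit">' + escapeHtml(part) + "</span>"
        : escapeHtml(part)
    )
    .join("");
}
```

On the page, the result could be assigned to the innerHTML of whatever element displays the resume text, with a style rule for the "hit" class supplying the highlight colour. Escaping each piece separately, rather than searching the already-escaped HTML, avoids accidentally matching inside entities such as &amp;amp;.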