We are displaying the HTML body extracted from .MSG files exported from Outlook.
To display the HTML body, one needs to decompress the RTF from the PR_RTF_Compressed
field and then decode the RTF to HTML (Outlook actually encodes the HTML as RTF when exporting MSG files). We are using the RDO library to parse the MSG files and extract the HTML body.
RDO produces HTML that does not always match what Outlook displays (for example, the text size sometimes differs).
Is anybody aware of an implementation of HTML body extraction that would most closely match the appearance of HTML displayed by Outlook or is this impossible?
More thoughts than an answer...
Are you displaying the extracted body in a browser such as IE?
I expect that the issue is that Outlook (2007) uses the Word rendering engine to display HTML while browsers use their own. So, I don't think you are likely to find an extraction implementation that will help.
Can you apply a stylesheet to your extracted body document that would override most of the inconsistencies?
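As a rough illustration of the stylesheet idea, here is a minimal Python sketch that injects a `<style>` block into the extracted HTML before it is handed to the browser. The function name `inject_stylesheet` and the rules in `OVERRIDE_CSS` are hypothetical, chosen only for this example; the actual rules would depend on which differences you see between your output and Outlook.

```python
import re

# Hypothetical override rules: tune these to the inconsistencies you observe.
OVERRIDE_CSS = "body, td, p { font-family: Calibri, Arial, sans-serif; font-size: 11pt; }"


def inject_stylesheet(html: str, css: str = OVERRIDE_CSS) -> str:
    """Return html with a <style> block containing css added to it."""
    style = f'<style type="text/css">{css}</style>'

    # Preferred spot: just before </head>, so the rules come after any
    # styles the message already carries in its own head.
    match = re.search(r"</head\s*>", html, flags=re.IGNORECASE)
    if match:
        return html[:match.start()] + style + html[match.start():]

    # No head element: place the block right after the opening <body> tag.
    match = re.search(r"<body[^>]*>", html, flags=re.IGNORECASE)
    if match:
        return html[:match.end()] + style + html[match.end():]

    # Bare fragment: prepend the block.
    return style + html
```

Note that rules like these only win over styles defined in the message's own stylesheet or in `<font>` attributes; inline `style="..."` attributes take precedence unless the override rules are marked `!important`.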