I'm using antiword to read .doc files in PHP:
<?php
$filename = 'sample.doc';
$content = shell_exec('C:\\wamp\\www\\tester\\read_documents\\antiword\\antiword '.$filename);
echo $content;
?>
Is it possible to format the output?
The antiword manual page says:
-f Output in formatted text form. That means that bold text is printed like *bold*, italics like /italics/ and underlined text as _underlined_.
-p papersize Output in PostScript form. Printable on paper of the specified size: 10x14, a3, a4, a5, b4, b5, executive, folio, legal, letter, note, quarto, statement or tabloid.
-t Output in text form. (default)
-x document type definition Output in XML form. Currently the only document type definition is db (for DocBook).
So you can choose between several output formats; by default, antiword strips out all formatting. I'd guess you want the -f
option, and then possibly convert that simple formatting into HTML tags.
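As a rough sketch of that idea (not a tested recipe): run antiword with -f and turn the *bold*, /italics/ and _underlined_ markers into HTML tags. The function names antiwordFormattedToHtml and docToHtml are made up for this example, and the regular expression is only a heuristic. It assumes a marker starts and ends at a word edge on a single line, and it does not handle nested markers.

```php
<?php
// Hypothetical helper: convert antiword -f markers to HTML tags.
// Assumption: markers wrap a run of text on one line and are not nested.
function antiwordFormattedToHtml(string $text): string
{
    // Escape first, so text from the document cannot inject markup.
    $html = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    $tags = ['*' => 'b', '/' => 'i', '_' => 'u'];

    // One pass for all three markers, so the tags we insert are never re-scanned.
    // The look-arounds skip things like snake_case or a/b/c.
    $pattern = '~(?<![\w*/_])([*/_])(\S(?:.*?\S)?)\1(?![\w*/_])~';

    return preg_replace_callback($pattern, function (array $m) use ($tags) {
        $tag = $tags[$m[1]];
        return "<$tag>{$m[2]}</$tag>";
    }, $html);
}

// Hypothetical wrapper: $antiword is the path to the antiword executable.
function docToHtml(string $antiword, string $filename): ?string
{
    // escapeshellarg() keeps an odd file name from being read as shell syntax.
    $out = shell_exec($antiword . ' -f ' . escapeshellarg($filename));
    return is_string($out) ? antiwordFormattedToHtml($out) : null;
}
```

With the path from the question, usage would look something like `echo '<pre>' . docToHtml('C:\\wamp\\www\\tester\\read_documents\\antiword\\antiword', 'sample.doc') . '</pre>';`. The `<pre>` wrapper is just one way to keep antiword's line breaks.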