Is it possible to output formats other than .docx and .odt with TinyButStrong and the OpenTBS plugin?

Developer · https://www.devze.com · 2023-03-07 18:18 · Source: web
I have a module which merges a document from database records and a .docx or .odt document model.

I have to output .docx, .odt or .pdf. Outputting to the Microsoft and OpenDocument formats is no problem; everything works properly.

But what I want to know is, can I output to a format (like XML or HTML) which I can use to subsequently build a PDF document?

If I can't, are there any libraries which provide a merge document capability like:

          DOCX (or ODT) + database record => PDF

And I don't want to use phplivedocx.


I successfully put a portable version of LibreOffice on my host's web server, which I call from PHP to do a command-line conversion from .docx, etc. to PDF on the fly. I do not have admin rights on my host's web server. Here is my blog post describing what I did:

http://geekswithblogs.net/robertphyatt/archive/2011/11/19/converting-.docx-to-pdf-or-.doc-to-pdf-or-.doc.aspx

Yay! Convert directly from .docx or .odt to .pdf using PHP with LibreOffice (OpenOffice's successor)!
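
For readers who want the gist without following the link, here is a minimal sketch of that kind of command-line conversion. It is not taken from the blog post. It assumes LibreOffice's `soffice` binary is reachable; `convert_to_pdf` and the `SOFFICE` override variable are hypothetical names chosen for this example (the override is meant for a portable install outside the PATH). From PHP, such a command would typically be run through `exec()` or `shell_exec()`.

```shell
#!/bin/sh
# Sketch: convert an office document to PDF with LibreOffice in headless mode.
# SOFFICE is a hypothetical override for the binary location (for example a
# portable install); it defaults to "soffice" on the PATH.
convert_to_pdf() {
    input=$1
    outdir=${2:-.}
    [ -f "$input" ] || { echo "no such file: $input" >&2; return 1; }
    mkdir -p "$outdir" || return 1
    # --headless:        run without a GUI
    # --convert-to pdf:  target format
    # --outdir:          directory where the resulting PDF is written
    "${SOFFICE:-soffice}" --headless --convert-to pdf --outdir "$outdir" "$input"
}

# Example call (paths are placeholders):
#   convert_to_pdf merged.docx /tmp/pdf-out
```

The output file keeps the input's base name with a .pdf extension, so merged.docx would become merged.pdf inside the output directory.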


I don't know of any PHP library that does DOCX => PDF. In fact, converting DOCX to something else in PHP is still an open problem today. This is independent of how you made the DOCX.

But as you said, there are PHP libraries for HTML => PDF.

Html2Pdf is a well-regarded PHP library that does HTML => PDF. There is also DomPdf.

So if you can find a PHP library for DOCX => HTML, then it would work.

Of course, this approach has limitations: even though both PDF and DOCX are open formats, each has very specific features and needs a heavy rendering process, and the vendors keep some of the useful tricks to themselves.

Converting DOCX to HTML is theoretically possible. There is a Windows program by EpingSoft that does it. If you need to do it in PHP, some web articles explain how, but since I could not find any PHP code that does it, I guess it is more theoretical than practical.

http://www.quepublishing.com/articles/article.aspx?p=691502

How complicated that process would be depends on how much of Word's native formatting you need to preserve during the conversion.

If you want to try this way, it's good to know that OpenTBS enables you to read the XML before and after the merge. It is based on a PHP class named TbsZip that can read any XML file in the DOCX, since a DOCX is in fact a zip archive.
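
To see what that XML looks like without going through TbsZip, the archive can be inspected from the command line. The sketch below is only an illustration and does not use the OpenTBS API. It assumes the Info-ZIP `unzip` tool is installed; `main_xml_part` and `dump_main_xml` are hypothetical helper names. A DOCX keeps its body in word/document.xml, an ODT in content.xml.

```shell
#!/bin/sh
# Sketch: print the main XML part of a DOCX or ODT file to stdout.

# Map a file name to the archive member that holds the document body.
main_xml_part() {
    case $1 in
        *.docx) echo "word/document.xml" ;;
        *.odt)  echo "content.xml" ;;
        *)      echo "unsupported file type: $1" >&2; return 1 ;;
    esac
}

dump_main_xml() {
    part=$(main_xml_part "$1") || return 1
    # unzip -p writes the named member to stdout without extracting to disk
    unzip -p "$1" "$part"
}

# Example call (file name is a placeholder):
#   dump_main_xml merged.docx > document.xml
```

Running it on the template and on the merged result gives two XML files to compare, which is a quick way to judge how much formatting a DOCX => HTML step would have to handle.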


It is also possible to use PDF files directly in TBS after decompressing them:

qpdf --qdf --object-streams=disable in.pdf out.pdf