开发者

Is there any method in ruby or in python to convert pdf to xml

开发者 https://www.devze.com 2023-03-21 07:34 出处:网络
Recently I have got an requirement like i need to convert pdf files to xml.So please tell me how to do it in pyth开发者_C百科on or ruby programming languages.

Recently I have got an requirement like i need to convert pdf files to xml.So please tell me how to do it in pyth开发者_C百科on or ruby programming languages. please reply to my question as early as possible your help would be appreciated..


Use pdftohtml.

It can be installed with brew install pdftohtml. This adds pdftohtml to your path.

So, to convert pdf to xml, you can run pdftohtml -xml your_file.pdf your_file.xml

In python, you could use these lines:

import os

cmd = 'pdftohtml -xml your_file.pdf your_file'
os.system(cmd)

!note that pdftohtml automatically adds the 'xml' extension to your second file input

0

精彩评论

暂无评论...
验证码 换一张
取 消