
Extracting tags from PDF [closed]

Developer https://www.devze.com 2023-04-01 14:43 Source: Internet
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance. Closed 10 years ago.

Can someone recommend a library (Linux binary, JAR, or source) to extract the tag tree from a tagged PDF file? I tried PDFMiner, but it crashed on the first file I tried.


Have you tried iText? Take a look at PDFVole for an example of a project that displays this tree visually using iText. You will not be able to link the tree nodes to their corresponding page content with this approach, though.
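
For anyone unsure what the "tag tree" looks like once a library has parsed it, here is a minimal Python sketch of the traversal. It does not use iText or PDFMiner. It assumes the PDF objects have already been loaded into plain dicts with indirect references resolved, and `sample_root`, `walk_struct_tree`, and `format_tree` are hypothetical names made up for illustration. The key names (`/StructTreeRoot`, `/K`, `/S`, `/MCR`, `/MCID`, `/OBJR`) follow the logical structure section of the PDF specification, where integer kids are marked-content IDs.

```python
# Sketch only: plain dicts stand in for parsed PDF dictionaries.
# A real library would also need to resolve indirect object references.

def walk_struct_tree(node, depth=0):
    """Yield (depth, tag, mcids) for every structure element under node."""
    kids = node.get("/K", [])
    if not isinstance(kids, list):
        kids = [kids]  # /K may hold a single kid instead of an array

    # Collect marked-content IDs: bare integers or /MCR reference dicts.
    mcids = []
    for kid in kids:
        if isinstance(kid, int):
            mcids.append(kid)
        elif isinstance(kid, dict) and kid.get("/Type") == "/MCR":
            mcids.append(kid["/MCID"])

    # The tree root has no /S entry; only structure elements carry a tag.
    if "/S" in node:
        yield depth, node["/S"], mcids
        depth += 1

    # Recurse into child structure elements, skipping content references.
    for kid in kids:
        if isinstance(kid, dict) and kid.get("/Type") not in ("/MCR", "/OBJR"):
            yield from walk_struct_tree(kid, depth)


def format_tree(root):
    """Render the tree as indented text, one tag per line."""
    lines = []
    for depth, tag, mcids in walk_struct_tree(root):
        suffix = f" {mcids}" if mcids else ""
        lines.append("  " * depth + tag.lstrip("/") + suffix)
    return "\n".join(lines)


# Hypothetical, hand-written structure tree used only for demonstration.
sample_root = {
    "/Type": "/StructTreeRoot",
    "/K": {
        "/S": "/Document",
        "/K": [
            {"/S": "/H1", "/K": 0},
            {"/S": "/P", "/K": [1, 2]},
            {"/S": "/L", "/K": [
                {"/S": "/LI", "/K": [{"/Type": "/MCR", "/MCID": 3}]},
            ]},
        ],
    },
}

if __name__ == "__main__":
    print(format_tree(sample_root))
```

The sketch only lists tags and the marked-content IDs attached to them. Matching those IDs against the marked-content sequences in each page's content stream is a separate step that it does not attempt.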
