
PDF Parser API in Java [closed]

Developer https://www.devze.com 2023-01-07 14:26 Source: Internet
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.

Closed 9 years ago.


I want to convert PDF data into our own file format. Please help me choose the right API for parsing PDFs in Java or .NET. The parser should extract every component (element) from the PDF pages.


There's a library called iText that does what you want. It's sort of the #1 product out there and is free as in beer.

I've worked with iText before, extracting content from PDFs, and while it's not super-duper automatic, it allows you to get at everything.

Recommended, in other words.


PDF files do not contain "elements" as such. A PDF is a set of PDF objects that together generate the pages.
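
To make that point concrete, here is a minimal sketch (not a real parser) that lists the indirect objects in a PDF by scanning the raw bytes for "number generation obj" headers. The class and method names (PdfObjectScan, listObjects) are made up for illustration. It assumes a classic, uncompressed file layout: objects packed into object streams (PDF 1.5 and later) will not show up, and binary stream data could in principle produce a false match. A real library handles both cases.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of iText or PDFBox.
public class PdfObjectScan {
    // Header of an indirect object: "<object number> <generation> obj".
    private static final Pattern OBJ_HEADER =
            Pattern.compile("(?<![0-9])(\\d+)\\s+(\\d+)\\s+obj\\b");

    /** Returns "number generation" for each object header found in the raw bytes. */
    public static List<String> listObjects(byte[] pdfBytes) {
        // ISO-8859-1 maps each byte to exactly one char, so binary data cannot break decoding.
        String text = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        List<String> ids = new ArrayList<>();
        Matcher m = OBJ_HEADER.matcher(text);
        while (m.find()) {
            ids.add(m.group(1) + " " + m.group(2));
        }
        return ids;
    }

    public static void main(String[] args) {
        // A hand-written fragment with a catalog, a page tree, and one page object.
        String sample = "%PDF-1.4\n"
                + "1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
                + "2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
                + "3 0 obj\n<< /Type /Page /Parent 2 0 R >>\nendobj\n";
        // Prints [1 0, 2 0, 3 0]
        System.out.println(listObjects(sample.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

Note that none of the three objects is a "paragraph" or "table": the page is just a dictionary pointing at other objects, which is why any extraction tool has to reconstruct higher-level structure itself.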


Try PDFBox: http://java-source.net/open-source/pdf-libraries/pdf-box

Hope it will help.
