
Looking for a linux PDF library to extract annotations and images from a PDF [closed]

Developer https://www.devze.com 2023-03-29 05:41 Source: Internet
Closed. This question is seeking recommendations for books, tools, software libraries, and more. It does not meet Stack Overflow guidelines. It is not currently accepting answers.

We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.

Closed 8 years ago.


I'm looking for a free library (Java or Ruby) that runs on Linux and can extract images and annotations from PDFs, similar to what CGPDFDocument can do on OS X.

Thanks!


I don't know about images, but using the latest version of the Ruby pdf-reader library I was able to successfully extract the annotations from a big PDF file:

require 'pdf/reader'

# filename is the path to the PDF to inspect
PDF::Reader.open(filename) do |reader|
  reader.pages.each do |page|
    # :Annots points at the page's list of annotation references
    annots_ref = page.attributes[:Annots]
    actual_annots = reader.objects[annots_ref]
    if actual_annots && actual_annots.size > 0
      actual_annots.each do |annot_ref|
        # Resolve each reference to the annotation dictionary itself
        actual_annot = reader.objects[annot_ref]
        unless actual_annot[:Contents].nil?
          puts "Page #{page.number}," + actual_annot[:Contents].inspect
        end
      end
    end
  end
end

I imagine a similar approach could be used to extract images.
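For what it's worth, here is a rough sketch of how the image side might look with the same library. It is untested against real files and rests on a few assumptions: that your pdf-reader version exposes `page.xobjects` as a hash of name to stream, that each stream responds to `hash` (its dictionary) and `data` (its raw bytes), and that you only care about images stored as plain JPEG (a single `DCTDecode` filter), since those bytes can be written straight to a `.jpg` file. The helper names `jpeg_xobjects` and `extract_jpegs` are made up for illustration.

```ruby
# Hypothetical helper: keep only image XObjects stored as plain JPEG
# (a single DCTDecode filter), whose raw stream bytes form a .jpg file.
def jpeg_xobjects(xobjects)
  xobjects.select do |_name, stream|
    stream.hash[:Subtype] == :Image &&
      Array(stream.hash[:Filter]) == [:DCTDecode]
  end
end

# Hypothetical helper: write every JPEG image found on each page to out_dir.
# Returns the list of file paths written.
def extract_jpegs(filename, out_dir = ".")
  require 'pdf/reader' # gem install pdf-reader
  written = []
  PDF::Reader.open(filename) do |reader|
    reader.pages.each do |page|
      # Assumption: page.xobjects returns already-dereferenced streams
      jpeg_xobjects(page.xobjects).each do |name, stream|
        path = File.join(out_dir, "page#{page.number}-#{name}.jpg")
        File.binwrite(path, stream.data) # raw, still JPEG-encoded bytes
        written << path
      end
    end
  end
  written
end
```

Calling `extract_jpegs("some.pdf", "out")` should then leave one `.jpg` per JPEG image in `out`. Images using other filters (FlateDecode, CCITTFaxDecode and so on) are skipped here because they need decoding and a proper image header before they are viewable, and images nested inside form XObjects are not visited either.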
