Hi guys, I'm working on an app whose main job is PDF editing.
I understand Apple doesn't provide any API for editing a PDF, but that is what my requirements call for.
So I thought of extracting the whole contents of the PDF file and creating a new PDF after editing. Now I need to know how to extract the PDF formatting (headers, footers, images, highlighting, and so on).
I'm using the Tj operator to extract the PDF text. Which operators should I use to extract the other information in the PDF file?
Thanks in advance.
Images are painted on the page with the Do operator. Its operand is the name of the image in the XObject subdictionary of the resources dictionary. The Do operator also paints form XObjects (self-contained vector graphics), which are stored in the same place. The Subtype key in the image/form XObject dictionary gives you the object type: "Image" for images and "Form" for form XObjects.
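To make the Do lookup concrete, here is a rough sketch in Python (used only for illustration, not an iOS API). It assumes the content stream has already been decoded to bytes, that it contains no inline images or comments, and that string literals have no unbalanced parentheses. The function names and the xobjects dictionary are hypothetical stand-ins for whatever your PDF parser gives you.

```python
import re

# Simplified token pattern: literal strings, hex strings, names, bare tokens, brackets.
# Assumption: a real parser must also handle nested parentheses, comments (%),
# and inline images (BI ... EI), which this sketch ignores.
TOKEN = re.compile(
    rb"\((?:\\.|[^\\()])*\)|<[0-9A-Fa-f\s]*>|/[^\s/\[\]()<>]*|[^\s/\[\]()<>]+|[\[\]]"
)

def find_xobject_uses(content):
    """Return the XObject names painted with the Do operator, in order."""
    names = []
    previous = None
    for match in TOKEN.finditer(content):
        token = match.group()
        # Do takes exactly one operand: the name token right before it.
        if token == b"Do" and previous is not None and previous.startswith(b"/"):
            names.append(previous[1:].decode("latin-1"))
        previous = token
    return names

def classify(names, xobjects):
    """Pair each name with its Subtype from a resources-like dict (hypothetical shape)."""
    return [(n, xobjects.get(n, {}).get("Subtype", "unknown")) for n in names]

# Hypothetical sample data: one image, some text containing the word "Do", one form.
sample = b"q 200 0 0 100 50 700 cm /Im1 Do Q BT /F1 12 Tf (Do not) Tj ET q /Fm2 Do Q"
xobjects = {"Im1": {"Subtype": "Image"}, "Fm2": {"Subtype": "Form"}}
result = classify(find_xobject_uses(sample), xobjects)
print(result)
```

Note that the "Do" inside the text string is not counted, because the string is consumed as a single token. For a form XObject you would then scan its own content stream the same way, since forms can paint further images.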
The other elements are plain vector graphics and text. PDF files do not have headers, footers, paragraphs, etc. as standalone objects. What you see visually as a page header is, inside the PDF file, just plain text painted at the top of the page.
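Since headers and footers are not marked in the file, any detection has to be a guess based on position. The sketch below is one possible heuristic, not something the PDF format defines: it assumes you have already extracted each text run together with its y coordinate (origin at the bottom of the page), and the 72-point margin is an arbitrary assumption you would tune.

```python
def classify_runs(runs, page_height, margin=72.0):
    """Label each (text, y) run as 'header', 'footer', or 'body' by vertical position."""
    labelled = []
    for text, y in runs:
        if y >= page_height - margin:
            label = "header"   # inside the top band of the page
        elif y <= margin:
            label = "footer"   # inside the bottom band of the page
        else:
            label = "body"
        labelled.append((text, label))
    return labelled

# Hypothetical text runs on a US Letter page (792 points tall).
runs = [("Annual Report", 760.0), ("First paragraph", 600.0), ("Page 3", 40.0)]
labels = classify_runs(runs, page_height=792.0)
print(labels)
```

Checking whether the same text appears at the same position on several pages would make the guess more reliable.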
Highlights can be plain semi-transparent yellow rectangles (these are no different from any other rectangle on the page) or highlight annotations (these are listed in the page's Annots array).
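For the annotation case, the lookup could look roughly like this. The page is modelled as a plain Python dict, which is an assumption standing in for your parser's page object. Subtype, Rect and QuadPoints are the keys a highlight annotation dictionary normally carries.

```python
def highlight_annotations(page):
    """Return (Rect, QuadPoints) for each highlight annotation in a page dict."""
    found = []
    for annot in page.get("Annots", []):
        if annot.get("Subtype") == "Highlight":
            found.append((annot.get("Rect"), annot.get("QuadPoints")))
    return found

# Hypothetical page with one link annotation and one highlight annotation.
page = {
    "Annots": [
        {"Subtype": "Link", "Rect": [72, 700, 200, 712]},
        {"Subtype": "Highlight", "Rect": [72, 650, 300, 664],
         "QuadPoints": [72, 664, 300, 664, 72, 650, 300, 650]},
    ]
}
highlights = highlight_annotations(page)
print(highlights)
```

Highlights drawn as plain rectangles will not show up here. Those can only be found by inspecting the path and fill operators in the content stream.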