A better file format than PDF or EPUB?

Developer (https://www.devze.com), 2023-02-05 08:50, source: the web
My client wants us to build a custom document viewer for their app. (It really, truly needs to be custom, because there are a ton of application-specific features they need.)

We built one for them last year that took PDFs, generated page images, and backed the images using a hidden layer of text that could be selected and copied. We did it in Flex. It was a nightmare. PDF is horrid.

This year, we need to build one in HTML5 with similar requirements, except that most of the documents are now in Word or HTML; that is, they have reflowable text instead of the fixed layout and glyphs of PDF. But they still want to handle PDF in the same viewer.

I'm thinking that we need to convert all documents to some common file format that can handle both reflowable text and the fixed-position glyphs of PDF. (Each document would probably use one or the other, but not both.) It would be nice if it were an XML-like markup language that would say:

<text>here's some text</text>

-- or -- 

<glyph letter="a" name="my_a_glyph" position="10,10"/>
<image src="my_image" position="20,20"/>

or something like that.
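
To make the idea concrete, here is a minimal sketch of how markup like the above could be turned into HTML for the viewer. Everything in it is an assumption made for illustration: the `to_html` function, the wrapping `<document>` root element, and the reading of `position` as pixel offsets are not part of any existing format.

```python
import xml.etree.ElementTree as ET
from html import escape


def to_html(markup: str) -> str:
    """Convert the hypothetical <text>/<glyph>/<image> markup to an HTML fragment."""
    # Wrap in a made-up <document> root so several top-level elements parse.
    root = ET.fromstring(f"<document>{markup}</document>")
    parts = []
    for node in root:
        if node.tag == "text":
            # Reflowable content: a plain paragraph that the browser lays out.
            parts.append(f"<p>{escape(node.text or '')}</p>")
        elif node.tag in ("glyph", "image"):
            # Fixed-layout content: absolutely positioned (units assumed to be px).
            x, y = (float(v) for v in node.get("position").split(","))
            style = f"position:absolute;left:{x:g}px;top:{y:g}px"
            if node.tag == "glyph":
                parts.append(f'<span style="{style}">{escape(node.get("letter"))}</span>')
            else:
                parts.append(f'<img src="{escape(node.get("src"))}" style="{style}">')
        else:
            raise ValueError(f"unknown element: {node.tag}")
    # position:relative makes the container the origin for its absolute children.
    return '<div style="position:relative">' + "".join(parts) + "</div>"


if __name__ == "__main__":
    print(to_html('<glyph letter="a" name="my_a_glyph" position="10,10"/>'))
```

In this sketch the reflowable and fixed-position cases share one container, so a single viewer could render either kind of document.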

Is there any existing file format out there that can handle it? EPUB won't do the fixed-position text, and PDF sucks in too many ways to describe.


I think you can look at the FB2 (FictionBook 2) format. It is an XML-based format designed for publishing books. It supports images, though I am not sure whether they can be positioned absolutely.

Also, you can simply go with HTML and do HTML-to-PDF rendering when needed (there exist various components and libraries for this). I don't see (or you have not listed) any reason why this approach wouldn't work.


GROFF? Maybe build a macro library to customize it, as needed.

Groff/troff/nroff, the "run off" programs of Unix, can output PostScript or HTML. Conversion from PostScript to PDF is built into some PDF viewers, and there are also several existing programs for it, pstopdf for example.

GROFF has some fixed-layout options and some flow-like options. With GROFF, it's often easier to base most of the output on flowing text within prescribed bounds.
