Force pdflatex to create one box per word

Developer (https://www.devze.com), 2023-01-05 18:45. Source: the web
I'm converting ebook files to e-reader-optimized PDF files (the Sony e-reader can't properly justify text). I'm therefore converting HTML to LaTeX, and then building the LaTeX output with pdflatex.

The Sony reader has a function to look up words in a dictionary. However, it identifies words by analysing boxes, and pdflatex generates one box per line. As a result, I have lost the ability to use the dictionary search.

How do I tell pdflatex to put each word in a separate box?

EDIT:

I'm trying to tweak the output of the pdflatex command to make it produce one box per word. Consider this example:

\documentclass{minimal}

\begin{document}
    This is an example sentence.
\end{document}

When opened in a PDF editor after compilation, this sample appears as one text box containing the sentence "This is an example sentence." This is fine for most full-featured PDF readers. On my Sony e-reader, however, word selection is based on boxes; the reader therefore selects the full sentence and fails to find a definition for the word I clicked.

I noticed that pdflatex stops at punctuation marks. How can I proceed to make it create one box per word? In the output, I would then have one box for "This", one for "is", one for "an", and so on.


I'm guessing your trouble is not with boxes, but with your font encoding. Try putting the following just after your \documentclass{minimal}:

\usepackage{cmap} % Puts extra info in the PDF's font dictionary that helps searching
\usepackage{lmodern} % cmr, the default TeX font, has a wacky font layout
\usepackage[T1]{fontenc} % This and next line are recommended with lmodern
\usepackage{textcomp}
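
For reference, here is how those four lines fit into the example document from the question. This is only a sketch of the suggestion above: it changes the fonts and the encoding information embedded in the PDF, and whether that changes how the Sony reader picks out words is an assumption that has not been tested here.

```latex
\documentclass{minimal}

% cmap is loaded first so it can act before the font encoding is set up
\usepackage{cmap}
\usepackage{lmodern}
\usepackage[T1]{fontenc}
\usepackage{textcomp}

\begin{document}
    This is an example sentence.
\end{document}
```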


Set the hyphenation penalty to 10000 (effectively infinite):

\hyphenpenalty=10000

and perhaps increase the typesetting tolerance

\tolerance=1000

See http://dcwww.camd.dtu.dk/~schiotz/comp/LatexTips/LatexTips.html#nohyphen.
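
Put together, the two settings go in the preamble, as in the sketch below. The values are the ones suggested above. The sample text is only filler, and the effect on the reader's word boxes is an assumption: these settings stop TeX from hyphenating words, nothing more.

```latex
\documentclass{minimal}

% Forbid automatic hyphenation: a penalty of 10000 is treated as infinite
\hyphenpenalty=10000
% Accept looser lines (more stretched spaces) to make up for the lost hyphenation points
\tolerance=1000

\begin{document}
    This is an example sentence. It is followed by a somewhat longer
    sentence, so that the paragraph spans more than one line and the
    line breaker has real decisions to make without hyphenation.
\end{document}
```

If the log then fills up with overfull box warnings, raising \tolerance further is the usual next step.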


In case you don't know this, TeX makes layout decisions by assigning penalties to undesirable outcomes (too much or too little white space, horizontal or vertical; widow or orphan lines; overfull or underfull boxes; footnotes split across pages; and so on), then tries to minimize the total cost, per paragraph for line breaks and per page for page breaks.

You can tune the choices it makes quite extensively by adjusting the penalty values. A penalty of 10000 or more is treated as infinite, so a break at that point is absolutely forbidden. If no arrangement satisfies the constraints, the run does not stop: TeX typesets the best it can and reports overfull or underfull boxes in the log.
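
Penalties can also be placed by hand at a single point in the text, which makes the scale easy to try out. The sketch below uses only TeX primitives and the standard \nobreak and \break macros; the sentences are made up for illustration.

```latex
\documentclass{minimal}
\begin{document}
% \nobreak inserts \penalty10000: a line break is never taken at this point
Figure\nobreak\ 1 always keeps its number on the same line.

% \break inserts \penalty-10000: a line break is forced at this point
% (\hfil fills the rest of the line so it is not stretched)
First part\hfil\break
second part.

% An intermediate value only discourages a break;
% TeX weighs it against the quality of the alternatives
A long phrase\penalty500\ that may or may not be broken here.
\end{document}
```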
