OpenType Font Parsing for Pleasure and Profit (anyone understand these stupid tables?)

Developer (https://www.devze.com), 2023-01-01 17:48. Source: web

So, this is mainly for fun. I'm poking around, trying to find my way inside a few fonts, and I have a few questions I'd really appreciate some help with if anyone has done this kind of thing.

cmap table

The fonts I am testing with contain several cmap subtables of different formats. I can read them, but I don't understand which one I should be using, i.e. what is the strategy for choosing the most appropriate subtable? Does this even make sense?

glyf table

This is really making my head hurt. I'm going by what is on here. Looking at the second table on that page, I've got 'n' endPtsOfContours, 'n' instructions and 'n' flags, but it is not clear to me whether I have the same number of flags as contours (I know how many contours I have). Then, to make matters worse (fun!), I have an array of xCoords and an array of yCoords. These arrays seem to be of indeterminate length and may contain data of either BYTE or SHORT, but we are not going to tell you which.

Ok, I suppose this is what the instructions and flags are for but as you can probably tell I don't really know how to deal with them. Do I need a TrueType interpreter to access the coordinate data?

You are correct, of course.

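flags bit 1 (0x02): If set, the corresponding x-coordinate is 1 byte long (an unsigned magnitude whose sign comes from bit 4). If not set, it is 2 bytes (a signed value), unless bit 4 is set, in which case no bytes are stored and x is the same as for the previous point.

flags bit 2 (0x04): If set, the corresponding y-coordinate is 1 byte long (an unsigned magnitude whose sign comes from bit 5). If not set, it is 2 bytes (a signed value), unless bit 5 is set, in which case no bytes are stored and y is the same as for the previous point.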
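To make the flag handling concrete, here is a rough Python sketch of decoding a simple glyph without running any TrueType instructions. It assumes a few things beyond the two bits above: there is one flag per point (not per contour), the point count is the last endPtsOfContours value plus one, bit 3 (0x08) means the next byte is a repeat count for the flag, bits 4 and 5 give the sign of a 1-byte value or mean "same as previous" otherwise, and the stored coordinates are deltas from the previous point. The name `parse_simple_glyph` is made up for illustration, and composite glyphs are not handled.

```python
import struct

ON_CURVE = 0x01       # bit 0: point is on the curve
X_SHORT = 0x02        # bit 1: x delta is one unsigned byte
Y_SHORT = 0x04        # bit 2: y delta is one unsigned byte
REPEAT = 0x08         # bit 3: next byte says how many more times this flag repeats
X_SAME_OR_POS = 0x10  # bit 4: sign of a short x, or "x unchanged" if not short
Y_SAME_OR_POS = 0x20  # bit 5: sign of a short y, or "y unchanged" if not short


def parse_simple_glyph(data, offset=0):
    """Decode one simple glyph record; `offset` points at its glyph header."""
    num_contours, = struct.unpack_from(">h", data, offset)
    if num_contours < 0:
        raise ValueError("composite glyph; not handled in this sketch")
    pos = offset + 10  # skip numberOfContours, xMin, yMin, xMax, yMax

    end_pts = struct.unpack_from(">%dH" % num_contours, data, pos)
    pos += 2 * num_contours
    num_points = end_pts[-1] + 1 if num_contours else 0

    instr_len, = struct.unpack_from(">H", data, pos)
    pos += 2 + instr_len  # the hinting bytecode is skipped, never executed

    # One flag per point, after expanding the run-length "repeat" entries.
    flags = []
    while len(flags) < num_points:
        flag = data[pos]
        pos += 1
        flags.append(flag)
        if flag & REPEAT:
            flags.extend([flag] * data[pos])
            pos += 1

    def read_axis(pos, short_bit, same_bit):
        coords, value = [], 0
        for flag in flags:
            if flag & short_bit:          # 1 byte, sign taken from same_bit
                delta = data[pos]
                pos += 1
                if not flag & same_bit:
                    delta = -delta
            elif flag & same_bit:         # 0 bytes, coordinate unchanged
                delta = 0
            else:                         # 2 bytes, signed big-endian
                delta, = struct.unpack_from(">h", data, pos)
                pos += 2
            value += delta                # stored values are relative deltas
            coords.append(value)
        return coords, pos

    xs, pos = read_axis(pos, X_SHORT, X_SAME_OR_POS)  # all x values come first
    ys, pos = read_axis(pos, Y_SHORT, Y_SAME_OR_POS)  # then all y values
    points = [(x, y, bool(f & ON_CURVE)) for x, y, f in zip(xs, ys, flags)]
    return list(end_pts), points
```

The flags have to be fully expanded before touching the coordinates, because that is the only way to know how many bytes each xCoords and yCoords entry occupies, which is why the arrays look like they have indeterminate length.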

I wrote code to walk the TrueType tables a long time ago--in C of course. I suppose you can compare your results with the output of TTFDump (still available from Microsoft).

For the cmap encoding, favor any Unicode encoding first, either platform id = 0 or platform id = 3 with encoding id = 10 or 1 (the platform ids are listed with the name table), and favor cmap format 12 (complete Unicode space) over format 4 (only the Basic Multilingual Plane). After that, the relative priorities of the encodings become more vague {Wansung, BIG5, PRC, Shift-JIS...}, but also less important, since a font tends to be mainly a Japanese, Chinese, or Korean font, not all of the above at once. Formats 4 and 12 are by far the most common; formats 0, 2, and 6 are much rarer. Format 14 can be found in CJK fonts with variation selectors as a supplement to format 4 or 12, and format 13 can be found in a special "last resort" font (used during font fallback when no good choice supports the given text).
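The priority order above could be sketched in Python roughly as follows. This is only one way to encode the preference, not a definitive rule: the function names `read_cmap_records` and `pick_subtable` are invented for the example, the input is assumed to be the raw bytes of the cmap table (subtable offsets are relative to its start), and format 14 is skipped as a primary choice on the assumption that it only supplements another subtable.

```python
import struct


def read_cmap_records(cmap):
    """Return (platform_id, encoding_id, format, offset) for each subtable."""
    _version, num_tables = struct.unpack_from(">HH", cmap, 0)
    records = []
    for i in range(num_tables):
        platform_id, encoding_id, offset = struct.unpack_from(
            ">HHI", cmap, 4 + 8 * i)
        # Every subtable starts with a 16-bit format number.
        fmt, = struct.unpack_from(">H", cmap, offset)
        records.append((platform_id, encoding_id, fmt, offset))
    return records


def pick_subtable(records):
    """Pick the preferred record, or None if nothing usable is present."""

    def rank(record):
        platform_id, encoding_id, fmt, _offset = record
        is_unicode = platform_id == 0 or (
            platform_id == 3 and encoding_id in (1, 10))
        # Format 12 first, then format 4, then everything else.
        format_rank = {12: 0, 4: 1}.get(fmt, 2)
        # Tuples sort element by element, so Unicode wins before format does.
        return (not is_unicode, format_rank)

    # Format 14 (variation sequences) is not a standalone character map.
    candidates = [r for r in records if r[2] != 14]
    return min(candidates, key=rank) if candidates else None
```

Typical use would be `pick_subtable(read_cmap_records(cmap_bytes))`, after which the returned offset tells you where to start parsing the chosen subtable.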