I am looking at the hyphenation algorithm downloaded from the OpenOffice site, but even after reading the comment I can't understand what the parameters rep, pos, and cut are for. Could someone with the knowledge tell me what these parameters do? Here are the comments.
From the example, it seems like it's saying ff can be replaced with a single f, but what does that have to do with hyphenation?
Thanks,
/*
int hnj_hyphen_hyphenate2(): non-standard hyphenation.
(It supports Catalan, Dutch, German, Hungarian, Norwegian, Swedish
etc. orthography, see documentation.)
input data:
word: input word
word_size: byte length of the input word
hyphens: allocated character buffer (size = word_size + 5)
hyphenated_word: allocated character buffer (size ~ word_size * 2) or NULL
rep, pos, cut: pointers (point to the allocated and zeroed buffers
(size=word_size) or with NULL value) or NULL
output data:
hyphens: hyphenation vector (hyphenation points signed with odd numbers)
hyphenated_word: hyphenated input word (hyphens signed with `='),
optional (NULL input)
rep: NULL (only standard hyph.), or replacements (hyphenation points
signed with `=' in replacements);
pos: NULL, or difference of the actual position and the beginning
positions of the change in input words;
cut: NULL, or counts of the removed characters of the original words
at hyphenation,
Note: rep, pos, cut are complementary arrays to the hyphens, indexed with the
character positions of the input word.
For example:
Schiffahrt -> Schiff=fahrt,
pattern: f1f/ff=f,1,2
output: rep[5]="ff=f", pos[5] = 1, cut[5] = 2
Note: hnj_hyphen_hyphenate2() can allocate rep, pos, cut (word_size
length arrays):
char ** rep = NULL;
int * pos = NULL;
int * cut = NULL;
char hyphens[MAXWORDLEN];
hnj_hyphen_hyphenate2(dict, "example", 7, hyphens, NULL, &rep, &pos, &cut);
See example in the source distribution.
*/
int hnj_hyphen_hyphenate2 (HyphenDict *dict,
const char *word, int word_size, char * hyphens,
char *hyphenated_word, char *** rep, int ** pos, int ** cut);
I believe you are referring to the following comment:
// For example:
// Schiffahrt -> Schiff=fahrt,
// pattern: f1f/ff=f,1,2
// output: rep[5]="ff=f", pos[5] = 1, cut[5] = 2
The example refers to German hyphenation rules as they were before the spelling reform of the 1990s. Compound nouns in German are written as one word. Under the old rules, when the same consonant would appear three times in a row and a vowel followed, the third consonant was omitted: 'Schifffahrt' (consisting of 'Schiff' and 'Fahrt') was written as 'Schiffahrt'. The omitted letter was still written when the word was hyphenated.
So the meaning of the example is not that 'ff' can be replaced with a single 'f', but rather that 'ff' can be replaced with 'ff-f'.
The meaning of the parameters therefore would be:
rep
: contains the replacement 'ff-f' (stored as "ff=f", with = marking the hyphen), which is used instead of 'ff'
pos
: a value of 1 means that the replacement starts one letter before the hyphenation position of 5
cut
: a value of 2 means that 2 characters need to be removed from the input word.
These parameters only seem to be used for the rare case that a word is spelled differently when hyphenated.
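To make this reading concrete, here is a minimal sketch in C of how the three values could be applied to the input word. It is not taken from the library: apply_replacement is a hypothetical helper, and it assumes that the hyphenation position counts the characters before the hyphenation point, so the change starts at offset hyph_pos - pos. The library's own indexing convention may differ, so check the example in the source distribution before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the hyphenation library.
 *
 * hyph_pos: number of characters of `word` before the hyphenation point
 *           (an assumption about how the index is interpreted)
 * pos:      how many characters before that point the change starts
 * cut:      how many characters of `word` are removed
 * rep:      replacement text, with '=' marking the hyphen
 *
 * Returns 0 on success, -1 if the arguments are inconsistent or `out`
 * is too small to hold the result.
 */
int apply_replacement(const char *word, int hyph_pos, int pos, int cut,
                      const char *rep, char *out, size_t out_size)
{
    assert(word != NULL && rep != NULL && out != NULL);

    size_t word_len = strlen(word);
    size_t rep_len = strlen(rep);

    if (pos < 0 || cut < 0 || hyph_pos < pos)
        return -1;

    /* Offset of the first character that gets replaced. */
    size_t start = (size_t)(hyph_pos - pos);
    if (start + (size_t)cut > word_len)
        return -1;

    /* Characters of the original word that follow the removed part. */
    size_t tail_len = word_len - start - (size_t)cut;
    if (start + rep_len + tail_len + 1 > out_size)
        return -1;

    memcpy(out, word, start);                 /* unchanged prefix */
    memcpy(out + start, rep, rep_len);        /* replacement text */
    memcpy(out + start + rep_len,             /* unchanged suffix + NUL */
           word + start + (size_t)cut, tail_len + 1);
    return 0;
}
```

Calling apply_replacement("Schiffahrt", 5, 1, 2, "ff=f", out, sizeof out) keeps "Schi", drops the two characters "ff", inserts "ff=f", and appends "ahrt", which yields "Schiff=fahrt".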