
Segmentation rules for non-Latin-based languages like Chinese and Japanese

开发者 (Developer) https://www.devze.com 2022-12-18 18:24 Source: web

While exploring globalsight.com, I came across the segmentation rules (link). They use the full stop (.) as a delimiter. Which segmentation rules can we use to segment non-Latin-based languages in which a dot (.) means something other than a delimiter, or languages that don't have any delimiters, for example Chinese, Japanese and Korean?

What segmentation rules are used for these "non-Latin" languages (Chinese, Japanese)? How are the segmentation rules developed?

Thanks in advance, Manjushree


Japanese uses kinsoku shori. Not sure about the other two though.


Trados, the leading translation memory application, uses the following segmentation rules:

For Japanese and Chinese:

Full Stop:

Colons: ::

Punctuation: ?!?!
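
To make the idea concrete, here is a minimal Python sketch of punctuation-based segmentation for Chinese and Japanese text. It is not Trados's or GlobalSight's actual implementation. The terminator set (the ideographic full stop 。 plus full-width and ASCII question and exclamation marks), the list of closing quotes and brackets, and the function name segment are all assumptions chosen for illustration. The ASCII dot is deliberately left out, since it often appears inside numbers in such text, and colons are not treated as breaks here.

```python
import re
from typing import List

# Assumed terminator set (illustrative only, not an official rule set):
# ideographic full stop, full-width ! and ?, and their ASCII counterparts.
TERMINATORS = "。!?!?"
# Assumed closing quotes/brackets that stay attached to the preceding sentence.
CLOSERS = "」』)】”’"

# A segment is either: text, a run of terminators, then any closers;
# or trailing text that has no terminator at all.
_SEGMENT_RE = re.compile(
    rf"[^{TERMINATORS}]*[{TERMINATORS}]+[{CLOSERS}]*|[^{TERMINATORS}]+"
)


def segment(text: str) -> List[str]:
    """Split text into sentence-like segments after each run of terminators."""
    segments = []
    for match in _SEGMENT_RE.finditer(text):
        piece = match.group().strip()
        if piece:  # skip segments that are only whitespace
            segments.append(piece)
    return segments


if __name__ == "__main__":
    for s in segment("今天天气很好。你吃饭了吗?太好了!"):
        print(s)
```

Running it on the sample string prints three segments, one per sentence. A rule like this will still break inside quoted speech that ends with a terminator, so real translation tools typically layer exception rules on top of the basic break rules.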
