In HTML and CSS, how do I make Japanese text break lines correctly?

Developer https://www.devze.com 2023-02-15 16:16 Source: web
I'm writing a simple paragraph in both English and Japanese, using only HTML and CSS. The English text breaks lines normally (when a word no longer fits on a line, it's pushed to the next one).

With Japanese, though, only part of a word is pushed to the next line rather than the whole word. I've tried setting word-wrap to both break-word and normal, but nothing changes for the Japanese text.

How do I make whole words in Japanese jump to the next line, as happens in English?


English separates words with spaces, Japanese doesn't.

Whether characters in Japanese form a word or not depends on context. In many cases, looking for certain grammatical (kana) particles could be used to separate words, but this would be nowhere near reliable.

Essentially, you'd need a Japanese dictionary / understanding of the language to identify where the words start and end - a browser won't know how to do this.

Alternatively, if you know the start and end of the words, you could perhaps wrap each one in a span - then use CSS to ensure each span wraps to a new line as a whole when it doesn't fit.


Japanese has specific rules that are followed when breaking text. They are called 禁則処理 (kinsoku shori). Here is a link explaining the rules. The rules are mostly concerned with special characters. Have a look at any popular Japanese webpage and you will see that multi-character (kana and kanji) words are often split. I often see です split between lines.

Update: I stumbled across this tool recently. I haven't tried it out yet, but the theory is solid. If someone is looking to improve the line breaks with Japanese text this could be a good solution.


I'm not an expert in Japanese, so it's hard for me to tell whether things are wrapping correctly. But I just had to solve this problem myself, and both word-break: keep-all and white-space: nowrap seemed to fix it for me, so they might be worth trying out.
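
To compare the two properties side by side, here is a minimal sketch. The narrow box width, the class names, and the placement of the <wbr> elements are all assumptions made for illustration (the phrase boundaries are borrowed from the span example in another answer on this page). With keep-all, the browser stops breaking between CJK characters, so lines can still break at places such as explicit <wbr> elements; with nowrap, nothing wraps at all and the text is likely to overflow the box.

```html
<!DOCTYPE html>
<html lang="ja">
<head>
<meta charset="utf-8">
<title>word-break: keep-all vs. white-space: nowrap (sketch)</title>
<style>
  /* Narrow box (arbitrary width) so wrapping is easy to observe. */
  p { width: 9em; border: 1px solid #999; }

  /* keep-all: no automatic breaks between CJK characters. */
  .keep { word-break: keep-all; }

  /* nowrap: no wrapping at all; long text overflows the box. */
  .nowrap { white-space: nowrap; }
</style>
</head>
<body>
<!-- <wbr> marks the places where a break is acceptable (hand-picked). -->
<p class="keep">私は<wbr>キッチンで<wbr>パンを<wbr>食べました。</p>

<!-- Same sentence without any wrapping. -->
<p class="nowrap">私はキッチンでパンを食べました。</p>
</body>
</html>
```

Note that keep-all without any <wbr> or spaces can also overflow on a long run of characters, so it still relies on someone (or something) marking the word boundaries.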


Until browsers are smart enough to do on-the-fly semantic analysis of the language, there are only a couple of options:

1/ Understand enough of the language to be able to group semantic elements into their own unbreakable DOM elements. Something like this (without the line breaks):

<span class="el">私は</span>
<span class="el">キッチンで</span>
<span class="el">パンを</span>
<span class="el">食べました。</span>

Then in CSS, use something like .el { display: inline-block; }. You probably want to do this only on headings and other important pieces of text, since it could affect accessibility (i.e. how screen readers interpret the text). The other drawbacks are that (a) you need to understand the text to know where to add the blocks, and (b) this obviously only works for static text (and even then, it's still a manual, painstaking process).

2/ Use a tool that does the grouping for you. It could run on the client side, like TinySegmenter (which segments a bit too finely for my taste), or on the server side, like Budou, which uses the Google Cloud Natural Language API and ML to analyze your sentences. The downsides (at least for Budou) are that (a) you need Python (I think I saw a Node.js port somewhere), and (b) it's not free.
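
Both options can be sketched in one small page. The first paragraph uses the hand-grouped spans from option 1. The second is only a stand-in for option 2: it uses the browser's built-in Intl.Segmenter API, which is neither TinySegmenter nor Budou, and it assumes a browser that supports it. The box width, the "auto" id, and the choice to glue punctuation onto the preceding span are assumptions made for illustration. Expect the automatic segmentation to be finer-grained than the manual grouping.

```html
<!DOCTYPE html>
<html lang="ja">
<head>
<meta charset="utf-8">
<title>Keeping Japanese phrases together (sketch)</title>
<style>
  /* Narrow box (arbitrary width) so wrapping is easy to observe. */
  p { width: 9em; border: 1px solid #999; }

  /* An inline-block box is never split across lines. */
  .el { display: inline-block; }
</style>
</head>
<body>

<!-- Option 1: phrases grouped by hand. -->
<p><span class="el">私は</span><span class="el">キッチンで</span><span class="el">パンを</span><span class="el">食べました。</span></p>

<!-- Option 2 (stand-in): spans generated when the page loads. -->
<p id="auto">私はキッチンでパンを食べました。</p>

<script>
  const target = document.getElementById("auto");

  // Leave the text untouched if the browser lacks Intl.Segmenter.
  if (typeof Intl !== "undefined" && "Segmenter" in Intl) {
    const segmenter = new Intl.Segmenter("ja", { granularity: "word" });
    const source = target.textContent;
    target.textContent = "";

    let last = null;
    for (const { segment, isWordLike } of segmenter.segment(source)) {
      // Attach punctuation and other non-word segments to the previous
      // span so that a line never starts with a lone "。".
      if (!isWordLike && last) {
        last.textContent += segment;
        continue;
      }
      last = document.createElement("span");
      last.className = "el";
      last.textContent = segment;
      target.appendChild(last);
    }
  }
</script>
</body>
</html>
```

The accessibility caveat from option 1 applies to the generated spans as well, so this is probably best limited to headings and short pieces of text.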

Hope this helps!


Try setting the CSS property:

line-break: strict;

Check it out here.
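
A minimal sketch for trying this out, assuming a narrow box and made-up class names. As far as the CSS specification goes, strict applies the most stringent line-breaking rules, for example forbidding a break right before a small kana such as the ッ in キッチン, while loose permits such breaks. It governs kinsoku-style rules rather than keeping whole words together, so it may not be enough on its own for what the question asks.

```html
<!DOCTYPE html>
<html lang="ja">
<head>
<meta charset="utf-8">
<title>line-break: strict vs. loose (sketch)</title>
<style>
  /* Narrow box (arbitrary width) so wrapping is easy to observe. */
  p { width: 4.5em; border: 1px solid #999; }

  /* Strictest rules: e.g. no break before small kana. */
  .strict { line-break: strict; }

  /* Least restrictive rules, shown for comparison. */
  .loose { line-break: loose; }
</style>
</head>
<body>
<p class="strict">私はキッチンでパンを食べました。</p>
<p class="loose">私はキッチンでパンを食べました。</p>
</body>
</html>
```

Resizing the box width is the easiest way to see where the two paragraphs differ; the lang="ja" attribute on the root element is there because line breaking can depend on the content language.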
