Break the string on Full Stop for (Chinese, Arabic, Japanese, Russian, Korean, Dutch, Hindi, Greek, Urdu) using javascript

Developer https://www.devze.com 2023-01-21 19:09 Source: Internet

Related topics: javascript php

I am working on a language segmentation project. I implemented segmentation for English by using a regular expression that breaks the string at . ("Full Stop"). Now I want to add support for the following languages: Chinese, Arabic, Japanese, Russian, Korean, Dutch, Hindi, Greek, Urdu. I want to break strings in these languages at each language's full stop.

For example, the full stop in Chinese is 。 (Unicode value U+3002).

String

以有效應對各種事態」。他還表示，希望以符合21世紀的方式切實深化美日同盟關係。

Expected Result

Segment 1 :- 以有效應對各種事態」。
Segment 2 :- 他還表示，希望以符合21世紀的方式切實深化美日同盟關係。

I have to apply the same logic to the other languages (Arabic, Japanese, Russian, Korean, Dutch, Hindi, Greek, Urdu).

See String.split. You can use /([。])/ as the regular expression separator and add the punctuation characters of the other languages inside the square brackets. The round parentheses capture the delimiters, so split returns them as elements of the result array instead of discarding them.
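A minimal sketch of that approach follows. The character set is my assumption, not something from the question: . (U+002E) for Russian, Dutch, Greek, Korean and commonly Arabic, 。 (U+3002) for Chinese and Japanese, । (U+0964, danda) for Hindi, and ۔ (U+06D4) for Urdu. The names FULL_STOPS and splitOnFullStop are made up for illustration. Because split returns the captured delimiters as separate array elements, the function glues each one back onto the text before it.

```javascript
// Assumed, non-exhaustive set of sentence-final full stops:
//   .  U+002E  Russian, Dutch, Greek, Korean, commonly Arabic
//   。 U+3002  Chinese, Japanese
//   ।  U+0964  Hindi (danda)
//   ۔  U+06D4  Urdu
// Inside a character class the dot is a literal, so no escape is needed.
const FULL_STOPS = /([.。।۔])/;

// Hypothetical helper: returns the segments with their full stop kept.
function splitOnFullStop(text) {
  // With a capturing group, split yields [text, stop, text, stop, ..., tail].
  const parts = text.split(FULL_STOPS);
  const segments = [];
  for (let i = 0; i < parts.length; i += 2) {
    // Re-attach the delimiter (if any) to the text that precedes it.
    const segment = (parts[i] + (parts[i + 1] || "")).trim();
    if (segment) {
      segments.push(segment); // skip the empty tail after a final full stop
    }
  }
  return segments;
}

const sample = "以有效應對各種事態」。他還表示，希望以符合21世紀的方式切實深化美日同盟關係。";
splitOnFullStop(sample).forEach((segment, index) => {
  console.log("Segment " + (index + 1) + " :- " + segment);
});
```

Note that including . in the set also splits at decimal points and abbreviations (for example "3.14"), so real text will probably need extra rules on top of this.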

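In PHP you might use preg_split( REGEX , $yourString );

Replace the word REGEX with your regular expression, possibly the one @janmoesen mentioned. For multi-byte characters such as 。 you will most likely also need the u (UTF-8) modifier on the pattern.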