
Can someone explain this regular expression?

Developer https://www.devze.com 2023-02-04 12:18 Source: the web
/^[\p{Ll}\p{Lm}\p{Lo}\p{Lt}\p{Lu}\p{Nd}]+$/mu

This is the regular expression that CakePHP uses to validate alphanumeric strings. I don't understand what Ll, Lm, Lt, etc. are. Since this validates alphanumeric strings, it should test for numbers and characters. Could someone explain this expression a little?

Thank you.


Ll, Lm, Lo, Lt, Lu and Nd are Unicode character classes (general categories).

See here at around 1/3 of the page:

http://www.regular-expressions.info/unicode.html

  • \p{Ll} or \p{Lowercase_Letter}: a lowercase letter that has an uppercase variant.
  • \p{Lu} or \p{Uppercase_Letter}: an uppercase letter that has a lowercase variant.
  • \p{Lt} or \p{Titlecase_Letter}: a letter that appears at the start of a word when only the first letter of the word is capitalized.
  • \p{L&} or \p{Letter&}: a letter that exists in lowercase and uppercase variants (combination of Ll, Lu and Lt).
  • \p{Lm} or \p{Modifier_Letter}: a special character that is used like a letter.
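  • \p{Lo} or \p{Other_Letter}: a letter or ideograph that does not have lowercase and uppercase variants.
  • \p{Nd} or \p{Decimal_Digit_Number}: a digit zero through nine in any script except ideographic scripts.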
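
To see these categories in practice, here is a minimal sketch in Python rather than PHP, using the standard unicodedata module to look up each character's general category. The names ALLOWED, categories and is_alphanumeric are made up for illustration. The check only approximates the pattern on a single line of text: it ignores the /m flag and is not CakePHP's actual implementation.

```python
import unicodedata

# General categories accepted by the character class in the pattern.
# (Hypothetical constant name, chosen for this sketch.)
ALLOWED = {"Ll", "Lm", "Lo", "Lt", "Lu", "Nd"}


def categories(text):
    """Return the Unicode general category of every character in text."""
    return [unicodedata.category(ch) for ch in text]


def is_alphanumeric(text):
    """Approximate ^[...]+$ for one line: at least one character,
    and every character belongs to an allowed category."""
    return len(text) > 0 and all(
        unicodedata.category(ch) in ALLOWED for ch in text
    )


if __name__ == "__main__":
    # Lowercase, uppercase, titlecase digraph, modifier letter,
    # ideograph, and an Arabic-Indic digit.
    sample = "a\u00c0\u01c5\u02b0\u4e2d\u0663"
    print(categories(sample))
    print(is_alphanumeric(sample))      # letters and digits only
    print(is_alphanumeric("foo_bar"))   # underscore is category Pc
```

Punctuation, whitespace and the underscore fall outside the six categories, so strings containing them are rejected, while letters and decimal digits from any script are accepted.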


The codes between the curly brackets (Ll, Lm, Lt, etc.) are classes of Unicode characters. A quick Google search for Unicode character classes turns up, for example, the following list: http://www.siao2.com/2005/04/23/411106.aspx


If you regularly stumble upon weird regular expressions, try one of these tools: https://stackoverflow.com/questions/89718/is-there-anything-like-regexbuddy-in-the-open-source-world (although I'm not sure whether they explain those, mostly Unicode, placeholders). Otherwise, check out the list on http://regular-expressions.info/
