In my code, I use a regexp I googled somewhere, but I don't understand it. :)
preg_match("/^[\p{L} 0-9\-]{4,25}$/", $login))
What does that \p{L} mean? I know what it does: it matches all letters, national characters included.
And my second question, I want to sanitize user input for ingame chat, so I'm starting with the regexp mentioned above, but I want to allow most special characters. What's the shortest way to do it? Has someone already prepared a regexp to do it?
For \p, see Unicode character properties. Basically, it requires the character to belong to a specific Unicode category (letter, number, ...).
For your filter, it depends on what exactly you want to filter, but I think Unicode character classes are the way to go, adding individually any other characters that seem useful to you.
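As a rough sketch of that idea, here is a whitelist built from Unicode categories. The function name sanitizeChat and the choice of categories are my own assumptions, not something from the question, so adjust them to whatever your chat should allow:

```php
<?php
// Hypothetical helper: keeps only characters from the listed Unicode categories.
function sanitizeChat(string $message): string
{
    // L = letters, M = combining marks, N = numbers, P = punctuation,
    // S = symbols, Zs = space separators. Everything else is dropped,
    // including control characters such as tabs and newlines.
    $clean = preg_replace('/[^\p{L}\p{M}\p{N}\p{P}\p{S}\p{Zs}]+/u', '', $message);

    // preg_replace returns null on error, e.g. when the input is not valid UTF-8.
    return $clean ?? '';
}

// The NUL byte and the tab are removed, everything else is kept.
var_dump(sanitizeChat("Hello,\x00 wörld!\t:)"));
```

The /u modifier matters here: it makes PCRE read the pattern and the input as UTF-8, so the categories apply to whole characters rather than single bytes.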
The regular expression means:
Any string of 4 to 25 characters in which every character is a letter, a space, a digit (0-9), or a dash.
\p{L} means literally: a character that matches the property "L", where "L" stands for "any letter".
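To see it in action, here is a small sketch that wraps the pattern from the question in a function. The name isValidLogin is hypothetical, and I have added the /u modifier, which the original pattern does not have, on the assumption that the logins are UTF-8 strings:

```php
<?php
// Hypothetical helper; the pattern is the one from the question plus /u.
function isValidLogin(string $login): bool
{
    // With /u, PCRE treats pattern and subject as UTF-8, so \p{L} sees whole
    // code points and {4,25} counts characters rather than bytes.
    return preg_match('/^[\p{L} 0-9\-]{4,25}$/u', $login) === 1;
}

var_dump(isValidLogin('Żółw-42'));   // letters with diacritics, a dash, digits
var_dump(isValidLogin('abc'));       // only 3 characters, below the minimum of 4
var_dump(isValidLogin('user_name')); // the underscore is not in the class
```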
To understand how regular expressions work:
http://en.wikipedia.org/wiki/Regular_expression
http://www.php.net/manual/en/regexp.reference.unicode.php
If you want to include most characters, why not just exclude the ones that you are not allowing?
You can do this with a ^ at the start of your character class:
[^characters I don't want]
Disclaimer: Blacklisting might not be the best approach depending on what you're trying to do, and it has to be more thorough than whitelisting, because anything you forget to exclude gets through.
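A minimal sketch of the negated character class, assuming you want to reject angle brackets and invisible characters. The function name, the excluded set, and the 200-character limit are all made up for illustration:

```php
<?php
// Hypothetical blacklist: angle brackets plus the Unicode "Other" category
// (\p{C}), which covers control, format, and unassigned code points.
function isAllowedMessage(string $message): bool
{
    // Every character must be outside the excluded set. \z anchors at the very
    // end of the string, so a trailing newline is not silently accepted.
    return preg_match('/^[^<>\p{C}]{1,200}\z/u', $message) === 1;
}

var_dump(isAllowedMessage('gg wp :)'));  // nothing from the excluded set
var_dump(isAllowedMessage('<b>hi</b>')); // contains angle brackets
var_dump(isAllowedMessage("hi\n"));      // newline is a control character
```

This one validates rather than strips: a message either passes as a whole or is rejected.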