Regexp character filter

Developer https://www.devze.com 2022-12-09 03:46 Source: Internet
In my code, I use a regexp I googled somewhere, but I don't understand it. :)

preg_match("/^[\p{L} 0-9\-]{4,25}$/", $login)

What does that \p{L} mean? I know what it does: it matches all letters, national (non-ASCII) ones included.

My second question: I want to sanitize user input for in-game chat. I'm starting from the regexp above, but I want to allow most special characters. What's the shortest way to do it? Has someone already prepared a regexp for this?


For \p, see Unicode character properties. Basically, it requires the character to belong to a specific Unicode category (letter, number, ...).

For your filter, it depends on what exactly you want to allow, but I think building on Unicode character classes is a good way to go, adding individually any other characters that seem useful to you.
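As a rough illustration of that idea, here is a minimal sketch that keeps only characters from a few broad Unicode categories. The function name `sanitizeChatMessage` and the exact choice of categories are assumptions made for this example, not something from the question, and the /u modifier is added so the input is read as UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): removes every character
// that is outside the allowed Unicode categories.
function sanitizeChatMessage(string $message): string
{
    // \p{L} letters, \p{N} numbers, \p{P} punctuation, \p{S} symbols,
    // \p{Zs} space separators. The /u modifier makes PCRE treat both
    // the pattern and the subject as UTF-8.
    $clean = preg_replace('/[^\p{L}\p{N}\p{P}\p{S}\p{Zs}]/u', '', $message);

    // preg_replace() returns null on error, e.g. when the input is not valid UTF-8.
    return $clean === null ? '' : trim($clean);
}

// The bell character (\x07) is a control character, so it is dropped.
echo sanitizeChatMessage("Hello, world! :-) \x07"), "\n";
```

Note that characters such as < and > fall under \p{S}, so a filter like this would not replace escaping the message (for example with htmlspecialchars()) when it is displayed.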


The regular expression means:

Any string of 4 to 25 characters in which every character is a letter, a space, a digit, or a hyphen.

\p{L} means literally: a character that matches the property "L", where "L" stands for "any letter".
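To see this in practice, the sketch below runs the pattern from the question against a few made-up logins. One assumption is added here: the /u modifier, which tells PHP's PCRE to read the pattern and the subject as UTF-8. Without it, matching works byte by byte, so multi-byte letters may not be counted or classified reliably.

```php
<?php
// Pattern from the question, with the /u modifier added (an assumption of
// this sketch) so that \p{L} and the {4,25} length apply to UTF-8 characters.
$pattern = '/^[\p{L} 0-9\-]{4,25}$/u';

// Sample logins invented for illustration.
$logins = ["\u{0141}ukasz-99", 'abc', 'bad_name', 'Anna Maria'];

foreach ($logins as $login) {
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    $ok = preg_match($pattern, $login) === 1;
    echo $login, ' => ', $ok ? 'accepted' : 'rejected', "\n";
}
```

Here 'abc' fails because it is shorter than 4 characters, and 'bad_name' fails because the underscore is not in the character class.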

To understand how regular expressions work:

http://en.wikipedia.org/wiki/Regular_expression

http://www.php.net/manual/en/regexp.reference.unicode.php


If you want to include most characters, why not just exclude the ones that you are not allowing?

You can do this with a ^ at the start of your character class:

[^characters I don't want]

Disclaimer: Blacklisting might not be the best approach depending on what you're trying to do, and it has to be more thorough than whitelisting.
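A minimal sketch of the negated-class approach might look like this. The function name `passesBlacklist`, the list of rejected characters, and the 200-character limit are all assumptions chosen for illustration; a real blacklist would depend on where the chat text ends up.

```php
<?php
// Hypothetical blacklist check. Inside [...] a leading ^ negates the class,
// so the class below matches any single character that is NOT one of
// < > " ' & or a backslash.
function passesBlacklist(string $message): bool
{
    // In this single-quoted PHP string, \' is a literal quote and \\\\
    // becomes \\ in the regex, which matches one literal backslash.
    // {1,200} is an arbitrary length limit; /u treats the input as UTF-8.
    return preg_match('/^[^<>"\'&\\\\]{1,200}$/u', $message) === 1;
}

var_dump(passesBlacklist('gg wp, nice game!'));
var_dump(passesBlacklist('<script>alert(1)</script>'));
```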
