Hello All,
I am trying to use preg_match to check whether a single word is found within a string of text. The word needs to be picked up even if there are multiple instances of each character within the word (in the correct order). To make life hard for myself, I also want to pick up on the word even if the client has tried to 'fool' the preg_match by inserting certain characters within the word I wish to match.
It is for use in a swearword filter: if 'dave' is found, I will replace it with something else. I have tried to come up with the perfect regular expression, but I'm not having much luck. Please see the following examples and the issues I have found so far (I have used 3 as an example character the client could use to 'fool' the check):
Using: ~\b(?:3+)?d+(?:3+)?a+(?:3+)?v+(?:3+)?e+(?:3+)?\b~i
Okay
- Input: dave = pass
- Input: 3d3a3v3e3 = pass
- Input: ddddaaaavvvveeee = pass
- Input: 3ave = fail
Not Okay
- Input: dd3ddaa3aa3vv3vvee3ee = fail (I want this to pass)
Using: ~\b[d3]+[a3]+[v3]+[e3]+\b~i
Okay
- Input: dave = pass
- Input: 3d3a3v3e3 = pass
- Input: ddddaaaavvvveeee = pass
- Input: dd3ddaa3aa3vv3vvee3ee = pass
Not Okay
- Input: 3ave = pass (I want this to fail)
Thank you for any help on the regular expression, it's much appreciated.
Without discussing if it's a good profanity filter (probably not!), the following regex will fulfill your spec:
d.*a.*v.*e
If '3' is the only 'special' character, then try this:
d3*a3*v3*e
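For what it's worth, here is a small sketch for checking patterns against the inputs listed in the question. The first three patterns are taken from this thread. The fourth is only an assumption on my part, not something proposed above: it allows leading 3s, then requires at least one real d, a, v and e in order, each followed by any mix of that letter and 3. The helper name `matches` is made up for the example.

```php
<?php
// Patterns from the thread, plus one hypothetical variant (an assumption,
// not proposed by anyone in the thread).
$patterns = [
    'question #1'  => '~\b(?:3+)?d+(?:3+)?a+(?:3+)?v+(?:3+)?e+(?:3+)?\b~i',
    'question #2'  => '~\b[d3]+[a3]+[v3]+[e3]+\b~i',
    'answer'       => '~d3*a3*v3*e~i',
    'hypothetical' => '~\b3*d[d3]*a[a3]*v[v3]*e[e3]*\b~i',
];

// Test inputs copied from the question.
$inputs = ['dave', '3d3a3v3e3', 'ddddaaaavvvveeee', 'dd3ddaa3aa3vv3vvee3ee', '3ave'];

// Hypothetical helper: true when the pattern matches somewhere in the input.
function matches(string $pattern, string $input): bool
{
    return preg_match($pattern, $input) === 1;
}

// Print one result line per pattern/input pair.
foreach ($patterns as $name => $pattern) {
    foreach ($inputs as $input) {
        printf("%-13s %-24s %s\n", $name, $input, matches($pattern, $input) ? 'match' : 'no match');
    }
}
```

With the hypothetical pattern, the first four inputs should match and 3ave should not, because a literal d is required before the first a. It still hard-codes 3 as the only filler character, so it says nothing about whether the approach makes a good filter.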
This won't work.
For instance, your filter is going to block "firetruck" ;)
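To make the false-positive problem concrete, here is a rough sketch that builds the loose "letters in order" pattern (the d.*a.*v.*e style) from any word and tries it on innocent text. The function name `looseMatch` is made up for this example, and it assumes plain single-byte letters.

```php
<?php
// Hypothetical helper: builds e.g. d.*a.*v.*e from "dave" and reports
// whether that loose pattern fires anywhere in $text (case-insensitive).
function looseMatch(string $word, string $text): bool
{
    // Quote each character so regex metacharacters in $word stay literal.
    $parts = array_map(
        function (string $char): string {
            return preg_quote($char, '~');
        },
        str_split($word)
    );
    $pattern = '~' . implode('.*', $parts) . '~i';

    return preg_match($pattern, $text) === 1;
}

var_dump(looseMatch('dave', 'disadvantage'));                 // bool(true)
var_dump(looseMatch('dave', 'Did anyone visit the museum?')); // bool(true)
var_dump(looseMatch('dave', 'hello world'));                  // bool(false)
```

Any text that merely contains the letters in the right order trips the pattern, which is the same reason "firetruck" would get blocked.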
Someone could also just substitute a u for a v or a c for a <
I don't know if there is a good way to build a profanity filter, other than to have a large white-list of known words and their misspellings.
Perhaps you should rethink why you want the profanity filter. If your 'customer' wants it, have them supply a list of words they want blocked. It's not your problem.