Regex for search queries

开发者 (https://www.devze.com), 2022-12-15 07:47. Source: Internet
I have a page built in Django that has its own search engine. I need help constructing a regex that accepts only valid queries, meaning queries that consist solely of letters of the Polish alphabet (both upper- and lowercase) and the symbols * and ?. Can anyone help?

EDIT: I tried something like this:

query_re = re.compile(r'^\w*[\*\?]*$', re.UNICODE)
if not query_re.match(self.cleaned_data['query']):
    raise forms.ValidationError(_('Illegal character'))

but it also allows some invalid characters from other alphabets, and it won't allow queries such as *somest?ing*.


If your locale is correctly set, you would use

query_re = re.compile(r'^[\w\*\?]*$', re.LOCALE|re.IGNORECASE)

\w matches all locale-specific alphanumerics plus the underscore, so digits and _ are accepted as well: http://docs.python.org/library/re.html
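
To see what a \w-based character class actually lets through, here is a small check. It is only a sketch and assumes Python 3, where str patterns are Unicode-aware by default and, as far as I know, re.LOCALE is rejected for str patterns (since 3.6), so the flag is left out. The helper name accepts is made up for illustration.

```python
import re

# Same character class as in the answer above, without re.LOCALE
# (assumption: Python 3, where \w is Unicode-aware for str patterns).
word_re = re.compile(r'^[\w*?]*$')


def accepts(text):
    """Return True if the whole string consists of \\w characters, * and ?."""
    return word_re.match(text) is not None


print(accepts('*somest?ing*'))  # wildcards anywhere in the query are fine
print(accepts('zażółć'))        # Polish letters are matched by \w
print(accepts('abc_123'))       # digits and underscore are matched by \w too
print(accepts('привет'))        # so are letters from other alphabets
```

All four calls return True, so this pattern fixes the wildcard placement but is still looser than "Polish letters only".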


Try something like

regex = r'(?iL)^[\s\*\?a-z]*$'

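assuming your machine's locale is Polish. The first part, (?iL), sets the locale and ignorecase flags. The ^ matches the start of the string, \s matches any whitespace, and a-z matches any lowercase letter (or uppercase, thanks to the ignorecase flag). Note that a-z is a plain ASCII range, so letters with diacritics such as ą or ł are not covered by it, whatever the locale.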

Alternatively, instead of using (?L) and a-z, you could just explicitly list the allowable letters (e.g. abcdefghijklmnopqrstuvwxyz).
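
The explicit-list idea could look roughly like this. It is a sketch, not a definitive answer: it assumes Python 3, and it assumes that "Polish alphabet" means ASCII a-z plus the nine letters with diacritics (ą ć ę ł ń ó ś ź ż) in both cases. The names POLISH_LETTERS, QUERY_RE and is_valid_query are invented for the example.

```python
import re

# Assumption: allowed letters are ASCII a-z plus the nine Polish
# diacritic letters, upper- and lowercase. Adjust the list if q, v, x
# should be rejected or if spaces should be allowed.
POLISH_LETTERS = 'a-zA-ZąćęłńóśźżĄĆĘŁŃÓŚŹŻ'

# Inside a character class, * and ? need no escaping.
# "+" requires at least one character; use "*" to accept an empty query.
QUERY_RE = re.compile('[' + POLISH_LETTERS + '*?]+')


def is_valid_query(query):
    """Return True if query contains only the listed letters, * and ?."""
    # fullmatch anchors at both ends, so a trailing newline is rejected too.
    return QUERY_RE.fullmatch(query) is not None
```

In the form from the question, the check would become something like `if not is_valid_query(self.cleaned_data['query']): raise forms.ValidationError(...)`. Because the letters are listed explicitly, the result does not depend on the machine's locale.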
