Can't pass value into re.findall (python)

开发者 https://www.devze.com 2023-03-15 06:02 Source: web

Can anyone help me understand why this works...

z = re.findall(r'(foobar)', string)

But this doesn't?

regexStr = "r'(foobar)'"
z = re.findall(regexStr, string)

I've printed regexStr and determined that its output is IDENTICAL to r'(foobar)'.

Can someone please help? I've also tried escaping the apostrophes.

The "r" prefix on a string literal should be outside of the quotes:

regexStr = r'(foobar)'

From the docs - "String literals may optionally be prefixed with a letter 'r' or 'R'; such strings are called raw strings and use different rules for interpreting backslash escape sequences."
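A quick way to see the difference is to inspect the two strings directly. This is only a sketch; the variable names correct and mistaken are made up for illustration. It also suggests why the printed output looked identical: printing the second string shows r'(foobar)', which looks like the source code of the first literal, but the value of the first literal is just (foobar).

```python
# The same characters, once as a raw string literal and once with the
# prefix typed inside the quotes (as in the question).
correct = r'(foobar)'      # the r is syntax; the string value is (foobar)
mistaken = "r'(foobar)'"   # the r and both apostrophes are part of the value

print(correct)              # (foobar)
print(mistaken)             # r'(foobar)'
print(len(correct))         # 8
print(len(mistaken))        # 11
print(correct == mistaken)  # False
```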

A solution to your problem is

regexStr = r'(%s)' % searchString

where the value of searchString replaces %s.

In Python it is often better to use this construct than regular concatenation (meaning str1 + str2 + ...), especially as you don't have to care about converting ints, floats and so on to strings first.
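One caveat when the search string comes from a variable: if it may contain regex metacharacters such as . or *, those will be interpreted as pattern syntax. A minimal sketch of one way to handle that is below, assuming the goal is to match the text literally. The helper name find_all_literal and the sample inputs are hypothetical; re.escape is the standard library function that backslash-escapes special characters.

```python
import re

def find_all_literal(search_string, text):
    """Return every literal occurrence of search_string in text."""
    # re.escape() neutralizes metacharacters, so "a.b" matches only "a.b".
    pattern = r'(%s)' % re.escape(search_string)
    return re.findall(pattern, text)

plain_hits = find_all_literal("foobar", "foobar and foobar")
escaped_hits = find_all_literal("a.b", "a.b axb")
# Without escaping, the dot matches any character, so "axb" matches too.
unescaped_hits = re.findall(r'(%s)' % "a.b", "a.b axb")

print(plain_hits)      # ['foobar', 'foobar']
print(escaped_hits)    # ['a.b']
print(unescaped_hits)  # ['a.b', 'axb']
```

If the variable is meant to hold a real regular expression rather than literal text, skip re.escape and pass it through as is.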

More on the subject here: 3.5. Formatting Strings

The r should not be part of the string; it only tells the Python interpreter what kind of string literal it is:

r'(hello\n)'  # Raw string => (hello\n), the backslash and the n are kept as two characters
u'unicodestring'  # Unicode string

The r modifier builds a raw string. It tells Python not to convert backslash escape sequences to special characters, such as \t or \n, for built-in strings. It has nothing to do with regular expression escape sequences.

>>> len('\t')  # tab character only
1
>>> len(r'\t') # backslash character followed by the letter t
2

However, regular expression syntax has its own set of escaping rules, which often collide with the escape rules of built-in Python strings. The r prefix lets us deal with only one set of rules. For example, the first string below is a regular expression that matches word characters, and so is the second one, because Python converts \\ to \ in string literals unless the r prefix is provided.

>>> re.compile(r'\w') == re.compile('\\w')
True
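The collision matters most for sequences that mean something in both rule sets. As a small illustration (the sample text and variable names are made up), \b is a backspace character in an ordinary Python string literal but a word boundary in regex syntax:

```python
import re

text = "foo foobar"

# Raw literal: the regex compiler receives \b and treats it as a word boundary.
raw_hits = re.findall(r'\bfoo\b', text)

# Ordinary literal: Python turns \b into a backspace character before
# the regex compiler ever sees the pattern, so nothing matches.
plain_hits = re.findall('\bfoo\b', text)

# Doubling the backslashes by hand gives the same pattern as the raw literal.
doubled_hits = re.findall('\\bfoo\\b', text)

print(raw_hits)      # ['foo']
print(plain_hits)    # []
print(doubled_hits)  # ['foo']
```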

In your case r'(foobar)' is exactly equivalent to '(foobar)' because there is no backslash sequence to escape. This string is a regular expression only in your mind. The r prefix does not tell Python to interpret the string that way. Python only knows about regular expression objects, which you build with re.compile() or which are compiled implicitly by functions like re.findall().

Compiling regular expression objects follows its own set of rules, different from the escape sequence rules of built-in strings, and the regex rules are not related to the r prefix. The regular expression compiler does not understand its special meaning; only the Python interpreter does.

In your case the r in "r'(foobar)'" gets no special treatment, because it sits inside the quotes and is passed directly to the regex compiler as ordinary text. You are effectively building a regular expression that searches for the letter r, followed by an apostrophe, then foobar, then another apostrophe. That's why the two expressions differ.

>>> re.compile(r'(foobar)') == re.compile("r'(foobar)'") # your expressions
False
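To make that concrete, here is a small sketch of what the string from the question actually matches when used as a pattern. The sample inputs and variable names are invented for the demonstration.

```python
import re

pattern_from_question = "r'(foobar)'"  # r and the apostrophes are literal text

# Ordinary text containing foobar does not match, because the pattern
# also requires a leading r' and a trailing '.
no_hits = re.findall(pattern_from_question, "plain foobar text")

# Text that really contains r'foobar' does match; findall returns the group.
quoted_hits = re.findall(pattern_from_question, "quoted r'foobar' here")

print(no_hits)      # []
print(quoted_hits)  # ['foobar']
```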

The usage of the r prefix has no effect here:

>>> re.compile(r'(foobar)') == re.compile('(foobar)')
True
>>> re.compile(r"r'(foobar)'") == re.compile("r'(foobar)'")
True

For more information:

http://docs.python.org/reference/lexical_analysis.html#string-literals
http://docs.python.org/library/re.html#regular-expression-syntax
