Trouble in regular expression

开发者 https://www.devze.com 2023-02-09 04:53 Source: web

I am trying to match the G in 'Reference: G. ' using a regular expression.

I tried the following, but an error still occurs:

refresidue = re.compiler(r'(s/Reference: \ //n)')

Any suggestions? I'm quite new to this, so any help is much appreciated.

In 'Reference: G. ' the character after 'Reference: ' can be A, C, G, or T.

Sorry about the confusion. What I would like is for the output to contain only the character (A, C, G, or T), without the 'Reference: ' prefix.

This is my code:

refresidue = re.compiler(r'(s/Reference: \ //n)')

a_matchref = refresidue.search(row[2])

if a_matchref is not None:

   a_matchref = a_matchref.group(1)


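You're mixing regex syntax from another regex flavor with Python: the s/.../ form looks like the substitution syntax of Perl or sed, which Python's re module does not understand. The regex itself is also quite strange. Also, re.compile() only compiles a regex; it doesn't match it against anything.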
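If the s/.../ form in the question was meant as a substitution, the Python counterpart is re.sub(). The following is only a sketch under that assumption; the sample string is taken from the question, and the variable names are made up for illustration.

```python
import re

text = "Reference: G. "  # sample string from the question

# re.sub(pattern, replacement, string) replaces every match of the pattern.
# Here the "Reference:" prefix and any whitespace after it are deleted.
without_prefix = re.sub(r"^Reference:\s*", "", text)

# Optionally drop the trailing period and space as well.
base = without_prefix.rstrip(". ")

print(repr(without_prefix))
print(repr(base))
```

This removes text rather than capturing it, so it fits only when everything except the prefix should be kept.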

Assuming you want to match a single word character (\w matches a letter, digit, or underscore) after the text Reference:, try the following:

refresidue = re.search(r"Reference:\s*(\w)", your_text_to_be_matched).group(1)
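Note that the one-liner above raises AttributeError when nothing matches, because re.search() then returns None. A slightly more defensive sketch follows; the function name extract_reference and the narrower character class [ACGT] are assumptions based on the question, not part of the original answer.

```python
import re

# Literal "Reference:", optional whitespace, then exactly one base.
# Restricting the class to [ACGT] is an assumption taken from the question.
REFERENCE_RE = re.compile(r"Reference:\s*([ACGT])")


def extract_reference(text):
    """Return the base after 'Reference:', or None if the text does not match."""
    found = REFERENCE_RE.search(text)
    if found is None:
        return None
    return found.group(1)


print(extract_reference("Reference: G. "))
print(extract_reference("no reference here"))
```

Returning None keeps the caller in control of what to do with rows that carry no reference.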


Here's how I resolved the problem step by step. Even after several years of experience with regular expressions, some particular syntax always escapes me. At such times, it's best to start with a short expression that absolutely should match what you want.

Let's use the re module.

>>> import re

Now what is the error?

>>> refresidue = re.compiler(r'(s/Reference: \ //n)')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'compiler'

Ah, so what attributes does the re module have?

>>> dir(re)
['DEBUG', 'DOTALL', 'I', 'IGNORECASE', 'L', 'LOCALE', 'M', 'MULTILINE', 'S',
 'Scanner', 'T', 'TEMPLATE', 'U', 'UNICODE', 'VERBOSE', 'X', '_MAXCACHE', '__all__',
 '__builtins__', '__doc__', '__file__', '__name__', '__version__', '_alphanum',
 '_cache', '_cache_repl', '_compile', '_compile_repl', '_expand', '_pattern_type',
 '_pickle', '_subx', 'compile', 'copy_reg', 'error', 'escape', 'findall', 'finditer',
 'match', 'purge', 'search', 'split', 'sre_compile', 'sre_parse', 'sub', 'subn', 'sys',
 'template']

So it must be re.compile

>>> refresidue = re.compile(r'(s/Reference: \ //n)')

Ok, compilation complete. Let's use it to match the string.

>>> refresidue.match('Reference: G')

Nothing? Strip down the expression then.

>>> refresidue = re.compile(r'Reference:')
>>> refresidue.match('Reference: G')
<_sre.SRE_Match object at 0x7fe14701f030>

Of course it should match. How about adding the G?

>>> refresidue = re.compile(r'Reference: G')
>>> refresidue.match('Reference: G')
<_sre.SRE_Match object at 0x7fe14701f098>

Yes. I want the whole alphabet please.

>>> refresidue = re.compile(r'Reference: [A-Z]')
>>> refresidue.match('Reference: G')
<_sre.SRE_Match object at 0x7fe14701f030>

I also want to single out the letter.

>>> refresidue = re.compile(r'Reference: ([A-Z])')
>>> refresidue.match('Reference: G')
<_sre.SRE_Match object at 0x7fe1470b9738>

No problem so far. So how do I get at the parenthesized part?

>>> dir(refresidue.match('Reference: G'))
['__copy__', '__deepcopy__', 'end', 'expand', 'group', 'groupdict', 'groups', 'span', 'start']

group sounds like it.

>>> refresidue.match('Reference: G').group   
<built-in method group of _sre.SRE_Match object at 0x7fe1470b9738>

So it's a method. Let's try calling it.

>>> refresidue.match('Reference: G').group(0)
'Reference: G'

How about this?

>>> refresidue.match('Reference: G').group(1)
'G'

There, the G.
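To recap the session as a script: the sketch below shows why the original pattern found nothing and how match() differs from search(). The input line with leading text is hypothetical, and narrowing [A-Z] to [ACGT] is an assumption based on the question.

```python
import re

# The pattern from the question, with the function name corrected to re.compile.
# It requires the literal characters "s/" before "Reference:" and "//n" after
# two spaces, so ordinary input never matches.
original = re.compile(r'(s/Reference: \ //n)')
print(original.search('Reference: G'))  # None

fixed = re.compile(r'Reference: ([ACGT])')
line = 'Position 12, Reference: G. '  # hypothetical input with leading text

print(fixed.match(line))            # None: match() only tries position 0
print(fixed.search(line).group(1))  # G: search() scans the whole string
```

If the text is not guaranteed to start with Reference:, search() is the safer choice.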


I think this is what you are after, but maybe you can add more examples of the kind of data you are matching:

import re
refresidue = re.compile(r'Reference: ([A-Z])')

You use the above like this:

>>> refresidue.match("Reference: G").group(1)
'G'
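Applied to the loop from the question, this might look like the sketch below. The rows are invented for illustration; the question only shows that the text lives in row[2]. The character class is narrowed to [ACGT] as an assumption.

```python
import re

refresidue = re.compile(r"Reference: ([ACGT])")

# Hypothetical rows; only the use of row[2] comes from the question.
rows = [
    ["chr1", "101", "Reference: G. "],
    ["chr1", "102", "Reference: T. "],
    ["chr1", "103", "no call"],
]

bases = []
for row in rows:
    found = refresidue.search(row[2])
    if found is not None:  # skip rows without a reference base
        bases.append(found.group(1))

print(bases)
```

Compiling the pattern once outside the loop keeps the loop body short.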