Friends: in PostgreSQL PL/Python, I am trying to do an iterative search and replace in a text block, 'data'.
I use re.sub with a match pattern and a callback function, 'replace', that does the work. The objective is to have 'replace' called repeatedly, because some replacements generate further 'rule' matches, which require further replacements.
All works well through many, many replacements, and I'm managing to trigger the second pass of the repeat loop. Then something causes the regex call to return an integer(?), apparently at the point where it finds no matches. I've tried testing for 'None' and '0', with no luck. Ideas?
data = (a_huge_block of_text)
# ====================== THE FUNCTION ==============
def replace(matchobj):
    tag = matchobj.group(1)
    plpy.info("-------- matchobj.group(1), tag: ", tag)
    if matchobj.group(1) != '':
        (do all the replacement work in here)
# ====================== END FUNCTION ==============
passnumber = 0
# If _any_ pattern match is found, process all of data for _all_ matches:
while re.search('(rule:[A-Za-z#]+)', data) != '':
    # BEGIN repeat loop:
    passnumber = passnumber + 1
    plpy.info(' ================================ BEGIN PASS: ', passnumber)
    data = re.sub('(rule:[A-Za-z#]+)', replace, data)
    plpy.info(' =================================== END PASS: ', passnumber)
The code above seems to run OK into a second iteration, then:
ERROR: TypeError: sequence item 21: expected string, int found
CONTEXT: Traceback (most recent call last):
PL/Python function "myfunction", line 201, in <module>
data = re.sub('(rule:[A-Za-z#]+)', replace, data)
PL/Python function "myfunction", line 150, in sub
PL/Python function "myfunction"
I have also tried re.search(...) != '' and re.search(...) != 'None', with the same result. I do realize I must find the syntax to represent the match object in some readable form...
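One possible reading of the traceback: "sequence item 21: expected string, int found" is raised inside re.sub, when it joins the pieces of the result, which suggests that the callback returned an integer for one of the matches. The body of 'replace' is elided above, so this is only a guess. The sketch below uses a hypothetical lookup table, RULES, and hypothetical helper names to show the failure and a str() guard against it.

```python
import re

PATTERN = r'(rule:[A-Za-z#]+)'

# Hypothetical replacement table; one value is an int by mistake.
RULES = {'rule:a': 'alpha', 'rule:n': 42}

def replace_unsafe(matchobj):
    # Returns whatever is stored in the table, which may be an int.
    return RULES[matchobj.group(1)]

def replace_safe(matchobj):
    # Coerce the result to a string so re.sub can join the pieces.
    return str(RULES[matchobj.group(1)])

def try_sub(callback, text):
    # Run the substitution and report a TypeError instead of raising it.
    try:
        return re.sub(PATTERN, callback, text)
    except TypeError as exc:
        return 'TypeError: %s' % exc
```

If the real 'replace' builds its result from counters, ids, or query results, wrapping the return value in str() is one way to rule this cause out.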
The answer to this turned out to be quite simple, of course, once you know Python! (I don't!)
To initiate the repeat loop, I had been doing this test:
while re.search('(rule:[A-Za-z#]+)', data) != '':
I had also tried this one, which does not work either:
while re.search('(rule:[A-Za-z#]+)', data) != 'None':
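When nothing matches, re.search returns the object None, not a string, so neither '' nor 'None' ever compares equal to it and the loop never stops. The None result can be trapped, of course, but the quotes are not needed. It's as simple as that: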
while re.search('(rule:[A-Za-z#]+)', data) != None:
It's all so simple, once you know!
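For anyone who wants to try the loop outside PostgreSQL, here is a minimal sketch of the corrected test. The RULES table, the 'expand' function, and the max_passes cap are my own assumptions for illustration, not part of the original function. It uses "is not None", the usual Python spelling of the same test.

```python
import re

PATTERN = re.compile(r'(rule:[A-Za-z#]+)')

# Hypothetical rules: 'rule:outer' expands into text that contains another rule.
RULES = {'rule:outer': '[rule:inner]', 'rule:inner': 'done'}

def replace(matchobj):
    # Unknown tags are replaced with '' so the loop can terminate.
    return RULES.get(matchobj.group(1), '')

def expand(data, max_passes=50):
    # Repeat the substitution until no 'rule:' tag is left in data.
    passnumber = 0
    while PATTERN.search(data) is not None:
        passnumber += 1
        if passnumber > max_passes:
            # Guard against rules that keep regenerating each other.
            raise RuntimeError('rules still present after %d passes' % max_passes)
        data = PATTERN.sub(replace, data)
    return data, passnumber
```

Calling expand('a rule:outer b') takes two passes, because the first replacement produces a new 'rule:inner' tag that only the second pass resolves.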