开发者

Pyparsing problem with operators

开发者 https://www.devze.com 2023-02-12 00:40 出处:网络
开发者_如何学CI did a grammar with pyparsing, and I have a problem. The grammar tries to parse a search query (with operator precedence, parenthesis, etc), and I need for spaces to work like the and o
开发者_如何学C

I did a grammar with pyparsing, and I have a problem. The grammar tries to parse a search query (with operator precedence, parenthesis, etc), and I need for spaces to work like the and operator.

For example, this works fine:

(word and word) or word

But this fails:

(word word) or word

And I want the second query to works like the first one.

My actual grammar is:

WWORD = printables.replace("(", "").replace(")", "")
QUOTED = quotedString.setParseAction(removeQuotes)

OAND = CaselessLiteral("and")
OOR = CaselessLiteral("or")
ONOT = "-"

TERM = (QUOTED | WWORD)

EXPRESSION = operatorPrecedence(TERM,
    [
        (ONOT, 1, opAssoc.RIGHT),
        (OAND, 2, opAssoc.LEFT),
        (OOR, 2, opAssoc.LEFT)
    ])

STRING = OneOrMore(EXPRESSION) + StringEnd()


One way to address your problem is to define AND as an Optional operator. If you do this, you'll have to take extra care that real keywords like 'and' and 'or' aren't misinterpreted as search words. Also, with Optional, you can add a default string, so that even if the "and" is missing in the original search query, your parsed text will insert it for you (for easier post-parse processing).

from pyparsing import *

QUOTED = quotedString.setParseAction(removeQuotes)  
OAND = CaselessLiteral("and") 
OOR = CaselessLiteral("or") 
ONOT = Literal("-")
WWORD = ~OAND + ~OOR + ~ONOT + Word(printables.replace("(", "").replace(")", ""))
TERM = (QUOTED | WWORD)  
EXPRESSION = operatorPrecedence(TERM,
    [
    (ONOT, 1, opAssoc.RIGHT),
    (Optional(OAND,default="and"), 2, opAssoc.LEFT),
    (OOR, 2, opAssoc.LEFT)
    ])

STRING = OneOrMore(EXPRESSION) + StringEnd()

tests = """\
word and ward or wird
word werd or wurd""".splitlines()

for t in tests:
    print STRING.parseString(t)

Gives:

[[['word', 'and', 'ward'], 'or', 'wird']]
[[['word', 'and', 'werd'], 'or', 'wurd']]
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号