Dynamically evaluating simple boolean logic in Python_问答_开发者

I've got some dynamically-generated boolean logic expressions, like:

(A or B) and (C or D)
A or (A and B)
A
empty - evaluates to True

The placeholders get replaced with booleans. Should I,

Convert this information to a Python expression like True or (True or False) and eval it?
Create a binary tree w开发者_C百科here a node is either a bool or Conjunction/Disjunction object and recursively evaluate it?
Convert it into nested S-expressions and use a Lisp parser?
Something else?

Suggestions welcome.

Here's a small (possibly, 74 lines including whitespace) module I built in about an hour and a half (plus almost an hour to refactoring):

str_to_token = {'True':True,
                'False':False,
                'and':lambda left, right: left and right,
                'or':lambda left, right: left or right,
                '(':'(',
                ')':')'}

empty_res = True


def create_token_lst(s, str_to_token=str_to_token):
    """create token list:
    'True or False' -> [True, lambda..., False]"""
    s = s.replace('(', ' ( ')
    s = s.replace(')', ' ) ')

    return [str_to_token[it] for it in s.split()]


def find(lst, what, start=0):
    return [i for i,it in enumerate(lst) if it == what and i >= start]


def parens(token_lst):
    """returns:
        (bool)parens_exist, left_paren_pos, right_paren_pos
    """
    left_lst = find(token_lst, '(')

    if not left_lst:
        return False, -1, -1

    left = left_lst[-1]

    #can not occur earlier, hence there are args and op.
    right = find(token_lst, ')', left + 4)[0]

    return True, left, right


def bool_eval(token_lst):
    """token_lst has length 3 and format: [left_arg, operator, right_arg]
    operator(left_arg, right_arg) is returned"""
    return token_lst[1](token_lst[0], token_lst[2])


def formatted_bool_eval(token_lst, empty_res=empty_res):
    """eval a formatted (i.e. of the form 'ToFa(ToF)') string"""
    if not token_lst:
        return empty_res

    if len(token_lst) == 1:
        return token_lst[0]

    has_parens, l_paren, r_paren = parens(token_lst)

    if not has_parens:
        return bool_eval(token_lst)

    token_lst[l_paren:r_paren + 1] = [bool_eval(token_lst[l_paren+1:r_paren])]

    return formatted_bool_eval(token_lst, bool_eval)


def nested_bool_eval(s):
    """The actual 'eval' routine,
    if 's' is empty, 'True' is returned,
    otherwise 's' is evaluated according to parentheses nesting.
    The format assumed:
        [1] 'LEFT OPERATOR RIGHT',
        where LEFT and RIGHT are either:
                True or False or '(' [1] ')' (subexpression in parentheses)
    """
    return formatted_bool_eval(create_token_lst(s))

The simple tests give:

>>> print nested_bool_eval('')
True
>>> print nested_bool_eval('False')
False
>>> print nested_bool_eval('True or False')
True
>>> print nested_bool_eval('True and False')
False
>>> print nested_bool_eval('(True or False) and (True or False)')
True
>>> print nested_bool_eval('(True or False) and (True and False)')
False
>>> print nested_bool_eval('(True or False) or (True and False)')
True
>>> print nested_bool_eval('(True and False) or (True and False)')
False
>>> print nested_bool_eval('(True and False) or (True and (True or False))')
True

[Partially off-topic possibly]

Note, the you can easily configure the tokens (both operands and operators) you use with the poor-mans dependency-injection means provided (token_to_char=token_to_char and friends) to have multiple different evaluators at the same time (just resetting the "injected-by-default" globals will leave you with a single behavior).

For example:

def fuzzy_bool_eval(s):
    """as normal, but:
    - an argument 'Maybe' may be :)) present
    - algebra is:
    [one of 'True', 'False', 'Maybe'] [one of 'or', 'and'] 'Maybe' -> 'Maybe'
    """
    Maybe = 'Maybe' # just an object with nice __str__

    def or_op(left, right):
        return (Maybe if Maybe in [left, right] else (left or right))

    def and_op(left, right):
        args = [left, right]

        if Maybe in args:
            if True in args:
                return Maybe # Maybe and True -> Maybe
            else:
                return False # Maybe and False -> False

        return left and right

    str_to_token = {'True':True,
                    'False':False,
                    'Maybe':Maybe,
                    'and':and_op,
                    'or':or_op,
                    '(':'(',
                    ')':')'}

    token_lst = create_token_lst(s, str_to_token=str_to_token)

    return formatted_bool_eval(token_lst)

gives:

>>> print fuzzy_bool_eval('')
True
>>> print fuzzy_bool_eval('Maybe')
Maybe
>>> print fuzzy_bool_eval('True or False')
True
>>> print fuzzy_bool_eval('True or Maybe')
Maybe
>>> print fuzzy_bool_eval('False or (False and Maybe)')
False

It shouldn't be difficult at all to write a evaluator that can handle this, for example using pyparsing. You only have a few operations to handle (and, or, and grouping?), so you should be able to parse and evaluate it yourself.

You shouldn't need to explicitly form the binary tree to evaluate the expression.

If you set up dicts with the locals and globals you care about then you should be able to safely pass them along with the expression into eval().

Sounds like a piece of cake using SymPy logic module. They even have an example of that on the docs: http://docs.sympy.org/0.7.1/modules/logic.html

I am writing this because I had a solve a similar problem today and I was here when I was looking for clues. (Boolean parser with arbitrary string tokens that get converted to boolean values later).

After considering different options (implementing a solution myself or use some package), I settled on using Lark, https://github.com/lark-parser/lark

It's easy to use and pretty fast if you use LALR(1)

Here is an example that could match your syntax

from lark import Lark, Tree, Transformer

base_parser = Lark("""
    expr: and_expr
        | or_expr
    and_expr: token
            | "(" expr ")"
            | and_expr " " and " " and_expr
    or_expr: token
            | "(" expr ")"
            | or_expr " " or " " or_expr
    token: LETTER
    and: "and"
    or: "or"
    LETTER: /[A-Z]+/
""", start="expr")


class Cleaner(Transformer):
    def expr(self, children):
        num_children = len(children)
        if num_children == 1:
            return children[0]
        else:
            raise RuntimeError()

    def and_expr(self, children):
        num_children = len(children)
        if num_children == 1:
            return children[0]
        elif num_children == 3:
            first, middle, last = children
            return Tree(data="and_expr", children=[first, last])
        else:
            raise RuntimeError()

    def or_expr(self, children):
        num_children = len(children)
        if num_children == 1:
            return children[0]
        elif num_children == 3:
            first, middle, last = children
            return Tree(data="or_expr", children=[first, last])
        else:
            raise RuntimeError()


def get_syntax_tree(expression):
    return Cleaner().transform(base_parser.parse(expression))

print(get_syntax_tree("A and (B or C)").pretty())

Note: the regex I chose doesn't match the empty string on purpose (Lark for some reason doesn't allow it).

You can perform that with Lark grammar library https://github.com/lark-parser/lark

from lark import Lark, Transformer, v_args, Token, Tree
from operator import or_, and_, not_

calc_grammar = f"""
    ?start: disjunction
    ?disjunction: conjunction
        | disjunction "or" conjunction   -> {or_.__name__}
    ?conjunction: atom
        | conjunction "and" atom  -> {and_.__name__}
    ?atom: BOOLEAN_LITTERAL           -> bool_lit
         | "not" atom         -> {not_.__name__}
         | "(" disjunction ")"
    BOOLEAN_LITTERAL: TRUE | FALSE
    TRUE: "True"
    FALSE: "False"
    %import common.WS_INLINE
    %ignore WS_INLINE
"""


@v_args(inline=True)
class CalculateBoolTree(Transformer):
    or_ = or_
    not_ = not_
    and_ = and_

    allowed_value = {"True": True, "False": False}

    def bool_lit(self, val: Token) -> bool:
        return self.allowed_value[val]


calc_parser = Lark(calc_grammar, parser="lalr", transformer=CalculateBoolTree())
calc = calc_parser.parse


def eval_bool_expression(bool_expression: str) -> bool:
    return calc(bool_expression)


print(eval_bool_expression("(True or False) and (False and True)"))
print(eval_bool_expression("not (False and True)"))
print(eval_bool_expression("not True or False and True and True"))