开发者

Algorithm to locate unbalanced parentheses in a string

开发者 https://www.devze.com 2023-02-02 21:11 出处:网络
PostScript/PDF string literals are surrounded by parentheses, and are allowed to contain unescaped parentheses as long as the parentheses are fully balanced.So for instance

PostScript/PDF string literals are surrounded by parentheses, and are allowed to contain unescaped parentheses as long as the parentheses are fully balanced. So for instance

( () )  % valid string constant
( ( )   % invalid string constant, the inner ( should be escaped

I know an algorithm to tell me if there are any unbalanced parentheses in a string; what I'm looking for is an algorithm that will locate a minimal set of parentheses that are unbalanc开发者_运维问答ed, so that I can then stick backslashes in front of them to make the whole a valid string literal. More examples:

(     ⟶   \(
()    ⟶   ()
(()   ⟶   \(() or (\()
())   ⟶   ()\) or (\))
()(   ⟶   ()\(


A modification of the standard stack based algorithm to detect imbalanced parenthesis should work for you. Here's some pseudo code:

void find_unbalaned_indices(string input)
{
    // initialize 'stack' containing of ints representing index at
    // which a lparen ( was seen

    stack<int index> = NIL          

    for (i=0 to input.size())
    {
        // Lparen. push into the stack
        if (input[i] == '(')
        {
            // saw ( at index=i
            stack.push(i);
        }
        else if (input[i] == ')')
        {
           out = stack.pop();
           if (out == NIL)
           {
               // stack was empty. Imbalanced RParen.
               // index=i needs to be escaped
               ... 
           }  
           // otherwise, this rparen has a balanced lparen.
           // nothing to do.
        }
    }

    // check if we have any imbalanced lparens
    while (stack.size() != 0)
    {
        out = stack.pop();
        // out is imbalanced
        // index = out.index needs to be escaped.
    }
}

Hope this helps.


def escape(s):
    return ''.join(r(')(', r('()', s)))

def r(parens, chars):
    return reversed(list(escape_oneway(parens, chars)))

def escape_oneway(parens, chars):
    """Given a sequence of characters (possibly already escaped),
    escape those close-parens without a matching open-paren."""
    depth = 0
    for x in chars:
        if x == parens[0]:
            depth += 1
        if x == parens[1]:
            if depth == 0:
                yield '\\' + x
                continue
            else:
                depth -= 1
        yield x
0

精彩评论

暂无评论...
验证码 换一张
取 消