PostScript/PDF string literals are surrounded by parentheses, and are allowed to contain unescaped parentheses as long as the parentheses are fully balanced. So for instance
( () ) % valid string constant
( ( ) % invalid string constant, the inner ( should be escaped
I know an algorithm to tell me if there are any unbalanced parentheses in a string; what I'm looking for is an algorithm that will locate a minimal set of parentheses that are unbalanc开发者_运维问答ed, so that I can then stick backslashes in front of them to make the whole a valid string literal. More examples:
( ⟶ \(
() ⟶ ()
(() ⟶ \(() or (\()
()) ⟶ ()\) or (\))
()( ⟶ ()\(
A modification of the standard stack based algorithm to detect imbalanced parenthesis should work for you. Here's some pseudo code:
void find_unbalaned_indices(string input)
{
// initialize 'stack' containing of ints representing index at
// which a lparen ( was seen
stack<int index> = NIL
for (i=0 to input.size())
{
// Lparen. push into the stack
if (input[i] == '(')
{
// saw ( at index=i
stack.push(i);
}
else if (input[i] == ')')
{
out = stack.pop();
if (out == NIL)
{
// stack was empty. Imbalanced RParen.
// index=i needs to be escaped
...
}
// otherwise, this rparen has a balanced lparen.
// nothing to do.
}
}
// check if we have any imbalanced lparens
while (stack.size() != 0)
{
out = stack.pop();
// out is imbalanced
// index = out.index needs to be escaped.
}
}
Hope this helps.
def escape(s):
return ''.join(r(')(', r('()', s)))
def r(parens, chars):
return reversed(list(escape_oneway(parens, chars)))
def escape_oneway(parens, chars):
"""Given a sequence of characters (possibly already escaped),
escape those close-parens without a matching open-paren."""
depth = 0
for x in chars:
if x == parens[0]:
depth += 1
if x == parens[1]:
if depth == 0:
yield '\\' + x
continue
else:
depth -= 1
yield x
精彩评论