Matching (and replacing) groups of text within a string

Developer https://www.devze.com 2022-12-31 20:05 Source: Internet
I have a string in the following format:

blah IIF((doc.PostTypeId == 1), (Parse(doc.Id) > 1000), (doc.ParentId > 1000)) blah

and I want to convert it to:

blah (doc.PostTypeId == 1 ? Parse(doc.Id) > 1000 : doc.ParentId > 1000) blah

So far I'm using the following regex for the match:

IIF\((?<one>[^,]*?),\ (?<two>[^,]*?),\ (?<three>[^,]*)\)

But I'm having problems balancing all the brackets. Is there a better way, or is regex the wrong tool for this?


In general, regular expressions can't balance parentheses correctly: doing so requires counting the depth of the nesting, which could be arbitrarily deep, and a regular expression can only store a finite amount of state (generally speaking).
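To make the counting argument concrete, here is a minimal Python sketch (not part of the original answer) of what a hand-written scanner does that a plain regular expression cannot: it keeps a depth counter while walking the string. The function name `matching_paren` is made up for illustration.

```python
def matching_paren(text: str, open_index: int) -> int:
    """Return the index of the ')' that closes the '(' at open_index."""
    if text[open_index] != "(":
        raise ValueError("open_index must point at '('")
    depth = 0
    for i in range(open_index, len(text)):
        if text[i] == "(":
            depth += 1          # one level deeper
        elif text[i] == ")":
            depth -= 1          # one level back out
            if depth == 0:      # back at the starting level: this is the match
                return i
    raise ValueError("unbalanced parentheses")
```

The counter `depth` is unbounded, which is exactly the state a finite automaton lacks.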

That said, I'm going to assume that the strings you're changing don't have parentheses nested more than, say, three or four levels deep, in which case it becomes possible. Here's how to build it up:

It's easy to match a sequence with no parentheses at all:

EXPR0:  [^()]*

We can use that to create a regular expression that matches a single non-nested expression in parentheses:

PAREN1:   \(EXPR0\)

What about an expression containing up to one level of parentheses? Well, that's just a mixture of PAREN1s with non-parenthesis characters:

EXPR1:    (?:PAREN1|EXPR0)*

Given that, we can of course match a balanced expression in parentheses with up to one level of nesting:

PAREN2:    \(EXPR1\)

which we can extend, in the same way, to match any balanced expression with no more than two levels of ():

EXPR2:    (?:PAREN2|EXPR0)*

and so on:

PAREN3:    \(EXPR2\)
EXPR3:     (?:PAREN3|EXPR0)*
PAREN4:    \(EXPR3\)
...
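
The construction above is mechanical, so it can be sketched as a loop. The following Python is an illustration only: the helper name `expr` is hypothetical, and the inner alternative is written as a single `[^()]` character rather than EXPR0 (`[^()]*`). That accepts the same strings while avoiding a starred group whose body can match the empty string, which can backtrack heavily on a failed match.

```python
import re

def expr(level: int) -> str:
    """Regex source for text whose parentheses are balanced
    and nested at most `level` deep (EXPR0, EXPR1, EXPR2, ...)."""
    pattern = r"[^()]*"                          # EXPR0: no parentheses at all
    for _ in range(level):
        paren = r"\(" + pattern + r"\)"          # PARENn wraps the previous EXPR
        pattern = r"(?:" + paren + r"|[^()])*"   # EXPRn: PARENn groups mixed with plain characters
    return pattern

# Usage: check whether a string stays within a given nesting depth.
print(bool(re.fullmatch(expr(1), "a (b) c")))
print(bool(re.fullmatch(expr(1), "a ((b)) c")))
```

Each extra level adds a fixed amount of text to the pattern, so the generated expression grows linearly with the depth.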

You can then use this to construct the match for the replacement you want to do - something along the lines of:

IIF\((?<one>EXPR5),(?<two>EXPR5),(?<three>EXPR5)\)

(Actually, you'll need to tweak things so that the EXPR5 expressions don't match unparenthesised commas, for example by using [^(),]* in place of EXPR0 at the outermost level only, but it should be clear enough how to do that, I hope.)

Of course, it's worth writing a short throwaway program to generate the required r.e. rather than constructing it manually!
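
As a rough idea of what such a throwaway program might look like, here is a Python sketch. It is a guess at one workable shape, not the answerer's code. The names `nested`, `argument`, `iif_regex` and `convert` are invented for this example. Python's `re` module spells named groups `(?P<one>...)` rather than `(?<one>...)`. Stripping one layer of outer parentheses from each argument is an assumption taken from the desired output shown in the question. The default depth of 5 mirrors EXPR5.

```python
import re

def nested(level: int) -> str:
    """Balanced text nested at most `level` deep; commas are allowed anywhere."""
    pattern = r"[^()]*"
    for _ in range(level):
        pattern = r"(?:\(" + pattern + r"\)|[^()])*"
    return pattern

def argument(level: int) -> str:
    """Like nested(level), but a comma may only appear inside parentheses."""
    return r"(?:\(" + nested(level - 1) + r"\)|[^(),])*"

def iif_regex(level: int = 5):
    """Compile a pattern for IIF(one, two, three) with bounded nesting."""
    arg = argument(level)
    return re.compile(
        r"IIF\((?P<one>" + arg + r"),\s*(?P<two>" + arg
        + r"),\s*(?P<three>" + arg + r")\)"
    )

def convert(text: str, level: int = 5) -> str:
    """Rewrite each IIF(a, b, c) in text as (a ? b : c)."""
    wrapped = re.compile(r"\((" + nested(level - 1) + r")\)")

    def unwrap(arg: str) -> str:
        # Drop one pair of parentheses only if it encloses the whole argument.
        arg = arg.strip()
        m = wrapped.fullmatch(arg)
        return m.group(1) if m else arg

    def ternary(m):
        return "({} ? {} : {})".format(
            unwrap(m.group("one")), unwrap(m.group("two")), unwrap(m.group("three"))
        )

    return iif_regex(level).sub(ternary, text)

print(convert("blah IIF((doc.PostTypeId == 1), (Parse(doc.Id) > 1000), (doc.ParentId > 1000)) blah"))
# prints: blah (doc.PostTypeId == 1 ? Parse(doc.Id) > 1000 : doc.ParentId > 1000) blah
```

A single pass rewrites only the outermost call, so an IIF nested inside another IIF's arguments would need the substitution applied again.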
