Scala parser combinators: how to parse "if(x)" if x can contain a ")"

Developer (https://www.devze.com), 2022-12-31 03:37, source: web

I'm trying to get this to work:

def emptyCond: Parser[Cond] = ("if" ~ "(") ~> regularStr <~ ")" ^^ { case s => Cond("",Nil,Nil) }

where regularStr is defined to accept a number of things, including ")". Of course, I want this to be an acceptable input: if(foo()). But for any if(x) it takes the ")" as part of the regularStr, so this parser never succeeds.

What am I missing?

Edit:

regularStr is not a regular expression. It is defined thus:

  def regularStr = rep(ident | numericLit | decimalLit | stringLit | stmtSymbol) ^^ { case s => s.mkString(" ") }

and the symbols are:

  val stmtSymbol = "*" | "&" | "." | "::" | "(" | ")" | "*" | ">=" | "<=" | "=" | 
               "<" | ">" | "|" | "-" | "," | "^" | "[" | "]" | "?" | ":" | "+" |
               "-=" | "+=" | "*=" | "/=" | "&&" | "||" | "&=" | "|="

I don't need an exhaustive language check, just the control structures. So I don't really care what's inside "()" in if(); I want to accept any sequence of identifiers, symbols, etc. For my purposes even if())) should be valid, where "))" is the if's "condition".
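
The failure can be reproduced in a few lines. This is only a sketch under assumptions: the question does not say which base class is used, so StandardTokenParsers is a guess based on ident, numericLit and stringLit; the symbol set is cut down; decimalLit is omitted because it is not part of that class; and the object name GreedyDemo is made up. It shows that rep consumes every token it can, including the final ")", and does not backtrack to give it up, so the trailing <~ ")" has nothing left to match.

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

// Hypothetical reproduction; StandardTokenParsers is an assumption.
object GreedyDemo extends StandardTokenParsers {
  lexical.reserved += "if"
  lexical.delimiters ++= List("(", ")", "*", ".")

  // Cut-down stmtSymbol that still contains "(" and ")", as in the question.
  def stmtSymbol: Parser[String] = "*" | "." | "(" | ")"

  // rep is greedy: it keeps going while any alternative matches.
  def regularStr: Parser[String] =
    rep(ident | numericLit | stringLit | stmtSymbol) ^^ { _.mkString(" ") }

  // Same shape as the question's emptyCond, returning the raw text.
  def emptyCond: Parser[String] = ("if" ~ "(") ~> regularStr <~ ")"

  // True if the whole input is accepted by emptyCond.
  def accepts(input: String): Boolean =
    phrase(emptyCond)(new lexical.Scanner(input)).successful
}
```

With these definitions GreedyDemo.accepts("if(x)") is false: regularStr has already eaten "x" and ")", and the parser then expects another ")" at end of input.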


A regular expression cannot recognize a language that has nested, balanced constructs such as (...), [...], {...}, etc. So you're going to need to use further context-free productions (not regular expressions) to match the regularStr portions.


OK, accepting if())) was not really a requirement, just an example of what I would be willing to accept in order to make my parsing as cheap as possible, to just worry about capturing control structures.

However, it appears I can't be so cheap and still have it work. So, since the if() construct has parentheses, all I have to do is expect what's inside to have well-balanced parentheses. A closing ")" where one isn't expected cannot be part of the condition.

I did this:

  val regularNoParens = ident | numericLit | decimalLit | stringLit | stmtSymbol 
  def regularParens: Parser[String] = "(" ~ rep(regularNoParens | regularParens) ~ ")" ^^ { case l ~ s ~ r => l + s.mkString(" ") + r } 
  def regularStr = rep(regularNoParens | regularParens) ^^ { case s => s.mkString(" ") }

And I took out "(" and ")" from stmtSymbol. Works!

Edit: it didn't support nesting, fixed it.
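
For anyone who wants to try this, here is a self-contained sketch of the same idea. Several things in it are assumptions rather than the original code: StandardTokenParsers as the base class, a shortened symbol list, the omission of decimalLit, the object name CondParser, the helper parseCond, and a one-field Cond that keeps the condition text (the real Cond takes three arguments whose meaning is not shown).

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

// Hypothetical stand-in for the question's three-argument Cond.
case class Cond(text: String)

object CondParser extends StandardTokenParsers {
  lexical.reserved += "if"
  // "(" and ")" must still be lexed as delimiters, even though
  // they are no longer alternatives of stmtSymbol.
  lexical.delimiters ++= List("(", ")", "*", "&", ".", ",", "<", ">", "=", "+", "-")

  // Shortened symbol list for illustration; no parentheses here.
  def stmtSymbol: Parser[String] =
    "*" | "&" | "." | "," | "<" | ">" | "=" | "+" | "-"

  def regularNoParens: Parser[String] =
    ident | numericLit | stringLit | stmtSymbol

  // A parenthesised group may contain plain tokens or further groups.
  def regularParens: Parser[String] =
    "(" ~ rep(regularNoParens | regularParens) ~ ")" ^^ {
      case l ~ s ~ r => l + s.mkString(" ") + r
    }

  def regularStr: Parser[String] =
    rep(regularNoParens | regularParens) ^^ { _.mkString(" ") }

  def emptyCond: Parser[Cond] =
    ("if" ~ "(") ~> regularStr <~ ")" ^^ { s => Cond(s) }

  // Hypothetical helper: Some(cond) if the whole input parses, else None.
  def parseCond(input: String): Option[Cond] = {
    val result = phrase(emptyCond)(new lexical.Scanner(input))
    if (result.successful) Some(result.get) else None
  }
}
```

With this sketch, CondParser.parseCond("if(foo())") gives Some(Cond("foo ()")), while an unbalanced input such as "if(x))" gives None, because the stray ")" can no longer be absorbed into the condition.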
