I'm toying with Scala's parser combinator library. I am trying to write a parser for a format where a length is specified, followed by a message of that length. For example:
x.parseAll(x.message, "5helloworld") // result: "hello", remaining: "world"
I'm not sure how to do this using combinators. My mind first goes to:
def message = length ~ body
But obviously body depends on length, and I don't know how to do that :p
Alternatively, message could be defined as a single hand-written Parser rather than a combination of Parsers, and I think that is doable (although I haven't checked whether a single Parser can consume several elements).
Anyways, I'm a scala noob, I just find this awesome :)
You should use into for that, or its abbreviation, >>:
scala> object T extends RegexParsers {
| def length: Parser[String] = """\d+""".r
| def message: Parser[String] = length >> { length => """\w{%d}""".format(length.toInt).r }
| }
defined module T
scala> T.parseAll(T.message, "5helloworld")
res0: T.ParseResult[String] =
[1.7] failure: string matching regex `\z' expected but `w' found
scala> T.parse(T.message, "5helloworld")
res1: T.ParseResult[String] = [1.7] parsed: hello
Be careful with precedence when using it, because ~ binds more tightly than >>. If you add "~ remainder" after the function above, for instance, Scala will interpret it as length >> ({ length => ... } ~ remainder)
instead of (length >> { length => ... }) ~ remainder
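To make the precedence point concrete, here is a minimal sketch with explicit parentheses. The names PrecedenceDemo, body, remainder and messageThenRest are made up for illustration, and remainder is assumed to be just a run of word characters.

```scala
import scala.util.parsing.combinator.RegexParsers

object PrecedenceDemo extends RegexParsers {
  def length: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Hypothetical helper: exactly n word characters.
  def body(n: Int): Parser[String] = """\w{%d}""".format(n).r

  // Hypothetical remainder: whatever word characters are left.
  def remainder: Parser[String] = """\w*""".r

  // The parentheses make ~ apply to the result of >>,
  // not to the function passed to >>.
  def messageThenRest: Parser[String ~ String] = (length >> body) ~ remainder
}
```

With this, PrecedenceDemo.parseAll(PrecedenceDemo.messageThenRest, "5helloworld") should succeed, with "hello" as the first component of the result and "world" as the second.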
This does not sound like a context-free language, so you will need to use flatMap:
def message = length.flatMap(n => bodyOfLength(n))
where length is of type Parser[Int] and bodyOfLength(n) would be based on repN, such as
def bodyOfLength(n: Int): Parser[String]
= repN(n, elem("any", _ => true)) ^^ { _.mkString }
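Put together, the flatMap approach might look like the sketch below. The object name LengthPrefixed is an assumption, and length is assumed to be a plain run of decimal digits converted with toInt.

```scala
import scala.util.parsing.combinator.RegexParsers

object LengthPrefixed extends RegexParsers {
  // Assumed length format: one or more decimal digits.
  def length: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Consume exactly n characters of any kind and join them into a String.
  def bodyOfLength(n: Int): Parser[String] =
    repN(n, elem("any", _ => true)) ^^ (_.mkString)

  // The body parser is built from the length that was just parsed.
  def message: Parser[String] = length.flatMap(n => bodyOfLength(n))
}
```

Unlike the \w-based regex version, elem with an always-true predicate accepts any character, so the body may contain spaces or punctuation. Use parse rather than parseAll if input is expected to follow the message.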
I wouldn't use parser combinators for this purpose. But if you have to, or if the problem becomes more complex, you could try this:
// Matches the literal `what` exactly x times; assumes x >= 1.
def times(x: Long, what: String): Parser[Any] = x match {
  case 1 => what
  case n => what ~ times(n - 1, what)
}
Don't use parseAll if you want some input to remain; use parse instead. You could parse the length, store the result in a mutable field x (ugly, I know, but useful here) and parse the body x times. You then get the parsed String, and the rest remains in the parser's input.
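As a rough illustration of parse leaving the rest of the input available, the sketch below reads the unconsumed text from the reader returned in Success. The names RestDemo and split are hypothetical, and it uses >> instead of a mutable field to carry the length.

```scala
import scala.util.parsing.combinator.RegexParsers

object RestDemo extends RegexParsers {
  def length: Parser[Int] = """\d+""".r ^^ (_.toInt)
  def message: Parser[String] = length >> { n => """\w{%d}""".format(n).r }

  // Hypothetical helper: returns (parsed body, unconsumed input), or None on failure.
  def split(input: String): Option[(String, String)] =
    parse(message, input) match {
      case Success(body, next) =>
        // `next` is the reader positioned just after the parsed message.
        val rest = next.source.subSequence(next.offset, next.source.length()).toString
        Some((body, rest))
      case _ => None
    }
}
```

Here RestDemo.split("5helloworld") should give Some(("hello", "world")), whereas parseAll on the same input fails because "world" is left over.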