
Scala: Using StandardTokenParser for parsing hexadecimal numbers

Developer https://www.devze.com 2023-01-11 06:03 Source: Internet
I am using Scala's parser combinators by extending scala.util.parsing.combinator.syntactical.StandardTokenParsers. This class provides the following methods:

def ident : Parser[String] for parsing identifiers and

def numericLit : Parser[String] for parsing a number (decimal I suppose)

I am using scala.util.parsing.combinator.lexical.Scanners from scala.util.parsing.combinator.lexical.StdLexical for lexing.
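
For reference, the setup described above can be sketched as follows. This is a minimal illustration, not the asker's actual code: the names TokenDemo, pair and parsePair are hypothetical, and it assumes the scala-parser-combinators library is on the classpath. It shows that ident and numericLit both yield the token text as a String, and that the default lexer does not treat a value like "ff" as a number.

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

// Hypothetical demo object; uses the default StdLexical lexer unchanged.
object TokenDemo extends StandardTokenParsers {
  // An identifier followed by a decimal number, e.g. "width 42".
  def pair: Parser[(String, String)] =
    ident ~ numericLit ^^ { case name ~ num => (name, num) }

  // Run the token scanner over the input and require the whole input to match.
  def parsePair(input: String): ParseResult[(String, String)] =
    phrase(pair)(new lexical.Scanner(input))
}
```

With this sketch, TokenDemo.parsePair("width 42") succeeds, while TokenDemo.parsePair("width ff") fails because "ff" is lexed as an identifier rather than a numeric literal.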

My requirement is to parse a hexadecimal number (without the 0x prefix) which can be of any length. Basically a grammar like: ([0-9]|[a-f])+

I tried integrating the regex parser (RegexParsers), but ran into type issues. Other attempts, such as extending the lexer's delimiters and the grammar rules, lead to "token not found" errors!


As I suspected, the problem can be solved by extending the behavior of the lexer rather than the parser. The standard lexer accepts only decimal digits, so I created a new lexer:

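import scala.util.parsing.combinator.lexical.StdLexical

class MyLexer extends StdLexical {
  override type Elem = Char
  // Treat hex letters as digits too, so the numeric-literal rule accepts them.
  override def digit = ( super.digit | hexDigit )
  lazy val hexDigits = Set[Char]() ++ "0123456789abcdefABCDEF".toArray
  lazy val hexDigit = elem("hex digit", hexDigits.contains(_))
}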

And my parser (which has to be a StandardTokenParsers) can be extended as follows:

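import scala.util.parsing.combinator.syntactical.StandardTokenParsers

object ParseAST extends StandardTokenParsers {
  // Plug in the hex-aware lexer and register the delimiters the grammar uses.
  override val lexical: MyLexer = new MyLexer()
  lexical.delimiters += ( "(" , ")" , "," , "@")
  ...
}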

The construction of the "number" from digits is taken care of by the StdLexical class:

class StdLexical {
...

def token: Parser[Token] = 
    ...
| digit~rep(digit)^^{case first ~ rest => NumericLit(first :: rest mkString "")}
}

Since StdLexical returns the parsed number as a String, this is not a problem for me, as I am not interested in the numeric value anyway.
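
Putting the lexer approach above together, here is a minimal end-to-end sketch. All names (HexLexical, HexListParser, hexNum, hexList, parseHexList) are hypothetical, and it assumes the scala-parser-combinators library is available. One caveat, as far as I can tell from StdLexical: the identifier rule is tried before the numeric rule, so a value that starts with a letter (for example "ff") still arrives as an identifier token. The sketch therefore also accepts identifiers made up only of hex characters.

```scala
import scala.util.parsing.combinator.lexical.StdLexical
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

// Hypothetical lexer: "digit" is widened so numeric literals may contain a-f/A-F.
class HexLexical extends StdLexical {
  lazy val hexChars: Set[Char] = "0123456789abcdefABCDEF".toSet
  override def digit: Parser[Char] = elem("hex digit", hexChars.contains(_))
}

object HexListParser extends StandardTokenParsers {
  override val lexical: HexLexical = new HexLexical
  lexical.delimiters += ","

  private def isHex(s: String): Boolean = s.forall(lexical.hexChars.contains)

  // Tokens starting with 0-9 arrive as numeric literals; tokens starting with
  // a letter arrive as identifiers and are kept only if every char is hex.
  def hexNum: Parser[String] =
    numericLit | (ident ^? { case s if isHex(s) => s })

  // Comma-separated list of hex numbers, returned as their source text.
  def hexList: Parser[List[String]] = repsep(hexNum, ",")

  def parseHexList(input: String): ParseResult[List[String]] =
    phrase(hexList)(new lexical.Scanner(input))
}
```

For example, HexListParser.parseHexList("1a, ff, 10") should succeed with the three strings, while an input containing a non-hex word such as "xyz" is rejected. Accepting identifiers this way is a design choice of the sketch; if the grammar also uses identifiers like "abc" elsewhere, the two cases become ambiguous and a dedicated token rule would be needed instead.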


You can use RegexParsers with an action attached to the token in question.

import scala.util.parsing.combinator._

object HexParser extends RegexParsers {
  val hexNum: Parser[Int] = """[0-9a-f]+""".r ^^ 
           { case s:String => Integer.parseInt(s,16) } 

  def seq: Parser[Any] = repsep(hexNum, ",")

}

This defines a parser that reads comma-separated hex numbers without a leading 0x, and it actually returns Int values.

val result = HexParser.parse(HexParser.seq, "1, 2, f, 10, 1a2b34d")
scala> println(result)
[1.21] parsed: List(1, 2, 15, 16, 27439949)

Note that there is no way to distinguish decimal-notation numbers. Also, I'm using Integer.parseInt, which is limited to the range of Int. To handle any length you may have to write your own parser and use BigInteger or arrays.
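
One way to lift the size limit is to convert with Scala's BigInt instead of Integer.parseInt. The following is only a sketch of that idea: BigHexParser and hexList are hypothetical names, the regex is widened to accept upper-case digits as an assumption, and the scala-parser-combinators library is assumed to be on the classpath.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical variant of HexParser that returns arbitrary-precision values.
object BigHexParser extends RegexParsers {
  // BigInt(s, 16) parses a base-16 string of any length.
  val hexNum: Parser[BigInt] = """[0-9a-fA-F]+""".r ^^ (s => BigInt(s, 16))

  // Comma-separated list; RegexParsers skips whitespace between tokens.
  def hexList: Parser[List[BigInt]] = repsep(hexNum, ",")
}
```

Calling BigHexParser.parseAll(BigHexParser.hexList, "f, 10, ffffffffffffffffffff") should then succeed even though the last value does not fit in an Int or a Long. parseAll is used here rather than parse so that trailing garbage is reported as a failure.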
