How do we keep multiple semantic values during parsing with Happy/Haskell

Developer https://www.devze.com 2023-01-09 04:46 Source: web
I'm trying to build a simple lexer/parser with Alex/Happy in Haskell, and I would like to carry some source location information from the text file through to my final AST.

I managed to build a lexer using Alex that builds a list of Tokens with location information:

data Token = Token AlexPosn Foo Bar
lexer :: String -> [Token]

In my Happy file, when declaring the %token part, I can declare which part of the token is its semantic value with the $$ symbol:

%token FOO  { Token _ $$ _ }

and in the parsing rules, $i will refer to this $$:

foo_list: FOO  { [$1] }
        | foo_list FOO { $2 : $1 }

Is there a way to refer to both the AlexPosn part and the Foo part of the FOO token? Right now I only know how to refer to one of them. I can't find any information on a way to "add several $$" and refer to them afterwards.

Is there a way to do so?

V.


In the end, I found two solutions:

  • pack all the meaningful data into a tuple, so that $$ points to this tuple, then extract the data by projection:

    data Token = Token (AlexPosn,Foo) Bar
    %token FOO { Token $$ some_bar }
    rule : FOO  { Ast (fst $1) (snd $1) }
    
  • do not use $$ at all: if you don't use $$, Happy gives you the full token during parsing, so it is up to you to extract what you really need from this token:

    data Token = Token AlexPosn Foo Bar
    %token FOO { Token _ _ some_bar }
    rule : FOO  { Ast (get_pos $1) (get_foo $1) }
    
    get_pos :: Token -> AlexPosn
    get_foo :: Token -> Foo
    

    ...

I think the first one is the more elegant. The second one can get heavy in terms of lines of code if you are carrying a lot of information: you have to write the "projections" by hand (pattern matching and so on), and doing that safely can be tricky if your token type is large.
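One way to cut down the hand-written projections of the second solution is record syntax, which lets the compiler generate the accessors. The following is only a sketch in plain Haskell, not Happy output: AlexPosn is redefined locally as a stand-in for the type Alex's posn wrapper generates, Foo and Bar get placeholder definitions, and the names tokPos, tokFoo, tokBar and mkAst are hypothetical.

```haskell
module Main where

-- Stand-in for the AlexPosn generated by Alex's "posn" wrapper
-- (absolute offset, line, column); defined here so the sketch compiles alone.
data AlexPosn = AlexPn !Int !Int !Int
  deriving (Eq, Show)

-- Placeholder payload types (assumption): the question leaves Foo and Bar abstract.
newtype Foo = Foo String deriving (Eq, Show)
data Bar = SomeBar | OtherBar deriving (Eq, Show)

-- Record syntax: the compiler derives the projections tokPos, tokFoo, tokBar,
-- so nothing has to be pattern matched by hand.
data Token = Token
  { tokPos :: AlexPosn
  , tokFoo :: Foo
  , tokBar :: Bar
  } deriving (Eq, Show)

-- Hypothetical AST node that keeps both the position and the value.
data Ast = Ast AlexPosn Foo deriving (Eq, Show)

-- What a semantic action such as { Ast (tokPos $1) (tokFoo $1) } would compute
-- once Happy hands over the whole token as $1.
mkAst :: Token -> Ast
mkAst t = Ast (tokPos t) (tokFoo t)

main :: IO ()
main = print (mkAst (Token (AlexPn 0 1 1) (Foo "x") SomeBar))
```

With such a Token type, the rule from the second solution would use tokPos and tokFoo in place of get_pos and get_foo. Note that if Token later gains more constructors, these accessors become partial, so the safety concern above still applies.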


It is also possible to keep multiple values like this:

data Token = Token AlexPosn Foo Bar
%token FOO { Token pos foo some_bar }
rule : FOO { Ast pos foo }

I'm not sure, though, whether Happy actually guarantees that this will always work. The reason it (maybe) works is that Happy generates code that pattern matches on Token pos foo some_bar, which makes pos and foo available in Ast pos foo.
