Simple Grammar for Lemon LALR Parser

开发者 https://www.devze.com 2023-03-22 06:01 Source: Internet
I've been stuck on this for a while now. I want to parse something as simple as:

LIKES: word1 word2 .. wordN
HATES: word1 word2 .. wordN

I am using Lemon + Flex. At the moment my grammar looks something like this:

%left LIKES MOODS FROM HATES INFO.

%syntax_error {  
  std::cout << "Syntax error!" << std::endl;  
}   

final ::= likes_stmt.
final ::= hates_stmt.

likes_stmt ::= LIKES list(A). { Data *data=Data::getInstance();data->likes.push_back(A);}
hates_stmt ::= HATES list(A). { Data *data=Data::getInstance();data->hates.push_back(A);}

list ::= likes_stmt VALUE(A).   { Data *data=Data::getInstance();data->likes.push_back(A);}
list ::= hates_stmt VALUE(A).   { Data *data=Data::getInstance();data->hates.push_back(A); }

list(A) ::= VALUE(B).           {A=B;}

But this only works for the first 2 words. Clearly I am doing something wrong, probably in the recursive definition? Any heads-up is appreciated :)


@crozzfire, Ira provided the correct answer to your original question; consider voting for it.

Let me address your additional requirement of separating the parsed values into two lists. Don't create separate rules for parsing these lists, since the grammar of the list is the same in both cases. What you need is a flag that indicates whether LIKES or HATES was found in front of the list. The 4th parameter of Lemon's Parse function is well suited to this. See "The Parser Interface" section of the Lemon documentation.

Below is Ira's grammar, updated to set and check such a flag variable. Note that the empty rules set_likes_state and set_hates_state must be placed just before the LIKES and HATES tokens, so that their actions run before any VALUE of the following list is reduced.

    %extra_argument {unsigned* state}

    final ::= likes_stmt.
    final ::= hates_stmt.

    likes_stmt ::= set_likes_state LIKES list.
    hates_stmt ::= set_hates_state HATES list.

    list ::= list VALUE(A).   { if (*state == 0) {/*add A to list1*/} else {/*add A to list2*/}; }
    list ::= VALUE(A).        { if (*state == 0) {/*add A to list1*/} else {/*add A to list2*/}; }

    set_likes_state ::= .     { *state = 0; }
    set_hates_state ::= .     { *state = 1; }
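
To see the effect of the flag without building a parser, here is a minimal hand-written sketch in C++. It is not Lemon output: the `Token` codes, the `Input` and `Data` structs and the `route` function are all hypothetical names made up for illustration. It only mimics what the actions above do, assuming the tokenizer delivers a keyword followed by its values. With a real Lemon parser, the same `unsigned*` would be passed as the 4th argument of each `Parse` call.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical token codes; a real Lemon build generates its own in a header.
enum Token { LIKES, HATES, VALUE };

// One token as a tokenizer might deliver it: a code plus the matched text.
struct Input {
    Token code;
    std::string text;
};

// Hypothetical container for the two result lists.
struct Data {
    std::vector<std::string> likes;
    std::vector<std::string> hates;
};

// Mimics the grammar actions: a keyword sets the flag, and every VALUE
// is appended to the list selected by the flag's current value.
void route(const std::vector<Input>& tokens, unsigned* state, Data& data) {
    assert(state != nullptr);
    for (const Input& t : tokens) {
        switch (t.code) {
            case LIKES: *state = 0; break;  // LIKES keyword seen
            case HATES: *state = 1; break;  // HATES keyword seen
            case VALUE:
                if (*state == 0) data.likes.push_back(t.text);
                else             data.hates.push_back(t.text);
                break;
        }
    }
}
```

Feeding it the tokens for `LIKES: tea jazz` and then `HATES: rain` leaves `tea` and `jazz` in `data.likes` and `rain` in `data.hates`, because the flag keeps its last value until the next keyword changes it.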


It looks to me as if your likes_stmt is defined in terms of list, and list is defined in terms of likes_stmt. I'm surprised it works for any words at all. It could be that I don't understand Lemon syntax (I sure don't get the list(A) bit), but BNF grammars tend to be pretty similar.

I'd expect your grammar to look more like:

    final = likes_stmt ;

    likes_stmt = LIKES list ;
    likes_stmt = HATES list ;

    list = value ;
    list = list value ;

Of course this would only recognize one LIKES phrase or one HATES phrase, but not both at the same time or in sequence, as implied by line 2 of your question.
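
One way to accept both phrases in sequence would be an extra left-recursive rule along the lines of `final = stmt ; final = final stmt ;`. That rule is an assumption, not part of the grammar above. The hand-written C++ recognizer below is only a sketch of that idea: the `Token` codes and the `parseList`, `parseStmt` and `parseFinal` functions are hypothetical names, and it uses loops where an LALR parser such as Lemon would use left recursion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token codes standing in for the tokenizer's output.
enum Token { LIKES, HATES, VALUE };

// list = value | list value   (one or more VALUE tokens)
// Advances pos past the list; returns false if no VALUE is present.
bool parseList(const std::vector<Token>& t, std::size_t& pos) {
    assert(pos <= t.size());
    if (pos >= t.size() || t[pos] != VALUE) return false;
    while (pos < t.size() && t[pos] == VALUE) ++pos;
    return true;
}

// stmt = LIKES list | HATES list
bool parseStmt(const std::vector<Token>& t, std::size_t& pos) {
    if (pos >= t.size() || (t[pos] != LIKES && t[pos] != HATES)) return false;
    ++pos;  // consume the keyword
    return parseList(t, pos);
}

// final = stmt | final stmt   (assumed extension: one or more statements)
bool parseFinal(const std::vector<Token>& t) {
    std::size_t pos = 0;
    if (!parseStmt(t, pos)) return false;
    while (pos < t.size()) {
        if (!parseStmt(t, pos)) return false;
    }
    return true;
}
```

Under these assumptions, a LIKES phrase followed by a HATES phrase is accepted, while a bare keyword with no values, or values with no leading keyword, is rejected.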
