How do I handle newlines in a Bison Grammar, without allowing all characters?

Developer https://www.devze.com 2023-02-08 03:37 Source: Internet
I've gone right back to basics to try and understand how the parser can match an input line such as "asdf", or any other jumble of characters, where there is no rule defined for this.

My lexer:

%{
    #include
%}
%%
"\n" {return NEWLINE; }

My Parser:

%{
    #include <stdlib.h>
%}
%token NEWLINE

%%

program:
| program line
;
line: NEWLINE
;

%%

#include <stdio.h>
int yyerror(char *s)
{
    printf("%s\n", s);
    return(0);
}
int main(void)
{
    yyparse();
    exit(0);
}

It is my understanding that this, when compiled and run, should accept nothing more than blank lines, but it also accepts any string as input without a syntax error.

What am I missing?

Thanks


Currently, your lexer echoes and then discards every non-newline character (that's the default action in lex for characters that don't match any rule), so the parser only ever sees newlines.

In general, your lexer needs to do something with any/every possible input character. It can ignore them (silently or with a message), or return tokens for the parser. The usual approach is to have the last lexer rule be:

.         return *yytext;

which matches any single character (other than a newline) and sends it on to the parser as-is. This is the last rule so that any earlier rule that matches a single character takes precedence (when two rules match the same text, lex picks the one listed first).

This is completely independent of the parser, which only sees that part of the input the lexer gives it.
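
Put together with the lexer from the question, the catch-all rule could look like the sketch below. The header name y.tab.h is an assumption (it is the default name produced by bison -y -d); substitute whatever header declares NEWLINE in your build. The noyywrap option is also an addition, included only so the scanner does not need a yywrap() function at link time.

```lex
%option noyywrap
%{
    /* Assumption: the parser header is y.tab.h; use the header that
       actually defines NEWLINE in your build. */
    #include "y.tab.h"
%}
%%
"\n"    { /* the only token the grammar knows about */ return NEWLINE; }
.       { /* anything else: hand the character itself to the parser */ return *yytext; }
```

With this lexer, a line such as "asdf" reaches the parser as the character 'a', which the grammar has no rule for, so yyparse() calls yyerror() instead of quietly accepting the line.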


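Flex adds a default rule that echoes any character no other rule matches. Add the option nodefault to suppress it; the scanner then stops with an error on input that no rule matches, instead of silently passing over it. Your lexer will then look like this instead: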

%option nodefault
%{
    #include <stdlib.h>
%}
%%
"\n" {return NEWLINE; }