I've been looking to recognise a language which does not fit the general Flex/Bison paradigm. It has completely different token rules depending on semantic context. For example:
main() {
    batchblock
    {
        echo Hello World!
        set batchvar=Something
        echo %batchvar%
    }
}
Bison apparently supports recognition of these types of grammars, but it needs "lexical tie-ins" to support them effectively. It provides an interface for doing this, but I'm confused as to how exactly I can supply different flex regexes depending on the context, if this is even possible.
Thanks in advance :)
"I'm confused as to how exactly I can supply different flex regexes depending on the context"
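Flex has a state mechanism (called "start conditions") whereby you can switch it between different sets of regexes. To declare a state, put
%x name_of_state
in the definitions section at the top of the file (after the %} and before the first %%). Then, in your matching rules (after the first %%), prefix the regex with the state name, with no space in between:
<name_of_state>regex goes here
This regex is then only matched when the scanner is in that state. Because %x declares an exclusive state, rules without a prefix are not matched while that state is active. There is also a wildcard, <*>, which makes a rule active in every state.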
There is more than one way to change states. The simplest is BEGIN(name_of_state); to go back to the initial state, use BEGIN(INITIAL). If you want to keep a stack of states, use yy_push_state(name_of_state) and yy_pop_state() instead; these are only available when you specify %option stack.
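As a rough illustration of the stack variant, here is a minimal sketch. The state name BATCH and the "batchblock {" trigger are assumptions made for the example, and the catch-all rules only stand in for real ones:

```lex
%option noyywrap stack
%x BATCH

%%
"batchblock"[ \t\r\n]*"{"   { yy_push_state(BATCH); /* switch to the batch rules */ }
<BATCH>"}"                  { yy_pop_state();       /* return to the state we came from */ }
<BATCH>.|\n                 { /* batch-specific rules would go here */ }
.|\n                        { /* rules for the outer language would go here */ }
%%

int main(void)
{
    return yylex();
}
```

The push/pop pair is mainly useful when the special block can be entered from more than one state, since yy_pop_state() returns to whichever state was active before rather than always to INITIAL.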
As it stands, if your special block is consistently signalled by 'batchblock {', this can be handled entirely inside flex. On the Bison side (or byacc, if you want to make your life at least a little easier), you'd just see the tokens change to something like BATCH_ECHO.
To handle it inside of flex, you'd use its start conditions capability:
%x batchblock
ws [ \t\r\n]*
%%
"batchblock"{ws}\{ { BEGIN(batchblock); }
<batchblock>echo { return BATCH_ECHO; }
<batchblock>set { return BATCH_SET; }
 /* ... */
<batchblock>\} { BEGIN(INITIAL); }
The patterns that start with <batchblock> can only match in the "batchblock" state, which is entered by the BEGIN(batchblock); action.
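To try the idea without a parser, the following is a self-contained sketch of a scanner that just prints the tokens it sees. The token codes, the BATCH_TEXT token, and the whitespace and catch-all rules are assumptions added for illustration; in a real project the token codes would come from the header generated by Bison:

```lex
%{
#include <stdio.h>
/* Hypothetical token codes; normally supplied by the Bison-generated header. */
enum { BATCH_ECHO = 258, BATCH_SET, BATCH_TEXT };
%}

%option noyywrap
%x batchblock
ws  [ \t\r\n]*

%%
"batchblock"{ws}"{"         { BEGIN(batchblock); }
<batchblock>echo            { return BATCH_ECHO; }
<batchblock>set             { return BATCH_SET; }
<batchblock>[^ \t\r\n}]+    { return BATCH_TEXT; /* any other word inside the block */ }
<batchblock>[ \t\r\n]+      { /* skip whitespace inside the block */ }
<batchblock>"}"             { BEGIN(INITIAL); }
.|\n                        { /* outer-language rules omitted in this sketch */ }
%%

int main(void)
{
    int tok;
    /* yylex() returns 0 at end of input */
    while ((tok = yylex()) != 0)
        printf("%d: %s\n", tok, yytext);
    return 0;
}
```

Assuming the file is saved as scanner.l (a name chosen here), it can be built with "flex scanner.l" followed by "cc lex.yy.c -o scanner" and fed the example from the question on standard input. Only text between "batchblock {" and the matching "}" should produce tokens. Note that this sketch does not handle nested braces inside the batch block.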