
Can't match single character in Alex grammar

Developer https://www.devze.com 2023-03-18 06:12 Source: the web

I finally got back to fleshing out a GitCommit message mode that I want to add to YI, but I seem to be missing something basic. I can't seem to match a single character in a grammar; all my rules only work if they match the entire line. I know this has to be possible because other grammars in YI obviously do this, but doing the same thing doesn't seem to work.

I want to have a commit mode that eventually looks very similar to the one in vim. One of the things that's useful in vim's mode is the keyword highlighting inside comments. Git puts a bunch of information inside comments in almost everything it does (commit, rebase, etc.), so this is useful. My thinking was to match the starting '#' character in git comments and switch to a different context that will match keywords. However, I can't seem to make a rule that matches just the '#': the rule switches to comment style on lines that contain only a '#', but on lines that contain anything after the '#' it does not switch styles.

What I have right now is:

<0> {
\#                             { m (const $ LineComment) Style.commentStyle }
$commitChars*$                 { c Style.defaultStyle }
}

<lineComment> {
$nl                            { m (const Digest) Style.defaultStyle }
.                              { c Style.regexStyle }
}

Details omitted, obviously. The idea is to switch to 'lineComment' mode when we see a '#' and style things differently until we see the end of the line. According to the documentation and examples there should be a way to do what I want. I've tried pretty much every permutation I can think of for the '#' pattern, but nothing changes the behavior I'm seeing.

What obvious thing am I missing?

Edit: The above code is from the implementation inside my YI branch. I have a standalone parser that exhibits the same problem here. If you run alex GitCommit.x && ghc --make GitCommit.hs && ./GitCommit < shortmsg you will see comment lines with content parsed as MessageLine and empty comment lines correctly marked CommentStart.


Okay, I finally figured this out. It looks like Alex always takes the longest match, not the first match. The rule for matching commit lines always produces the longer match, since it matches the whole line, so Alex always chooses that rule over the comment rule. Quoting from the Alex docs:

When the input stream matches more than one rule, the rule which matches the longest prefix of the input stream wins. If there are still several rules which match an equal number of characters, then the rule which appears earliest in the file wins.

I guess I should have read the docs more than once. The solution is to remove the '#' from the $commitChars character set.
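
For illustration, here is a minimal standalone sketch of that fix. It is not the actual YI mode: the token type, the "basic" wrapper, and the exact contents of $commitChars are assumptions made for the example, and it leaves out the start-code switching from the question. The point is only that once '#' is excluded from $commitChars, the message rule can no longer produce a longer match at a '#', so the single-character rule wins.

```haskell
{
-- Hypothetical standalone lexer (e.g. saved as GitCommit.x); names are
-- illustrative and not taken from the YI sources.
module Main (main) where
}

%wrapper "basic"

$nl          = [\n\r]
-- Assumption: a message character is anything except a newline or '#'.
-- Excluding '#' is the fix; otherwise the rule below would swallow the
-- whole comment line and beat the one-character '#' rule on length.
$commitChars = [^\n\r\#]

tokens :-

  \#             { \_ -> CommentStart }
  $commitChars+  { \s -> MessageLine s }
  $nl            ;

{
data Token
  = CommentStart
  | MessageLine String
  deriving (Eq, Show)

-- Read stdin and print one token per line.
main :: IO ()
main = getContents >>= mapM_ print . alexScanTokens
}
```

Built the same way as in the question (alex GitCommit.x && ghc --make GitCommit.hs), a line such as "# Please enter the commit message" should now come out as CommentStart followed by a MessageLine holding the rest of the line. One consequence of this simple character set is that a '#' in the middle of a message line is also tokenized as CommentStart, so a real mode would probably want to restrict the '#' rule further.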
