Parsec - 'many' and error messages

devze.com (https://www.devze.com), 2023-01-25 16:23. Source: the web
When I try to parse many p, I don't receive the 'expecting p' message:

> parse (many (char '.') >> eof) "" "a"
Left (line 1, column 1):
unexpected 'a'
expecting end of input

Compare to

> parse (sepBy (char '.') (char ',') >> eof) "" "a"
Left (line 1, column 1):
unexpected 'a'
expecting "." or end of input

which reports "." as I'd expect. many1 p <|> return [] works as well.

All of these functions accept empty input, so why doesn't many report what it's expecting? Is it a bug or a feature?


You'll get better error messages with manyTill:

> parse (manyTill (char '.') eof) "" "a"
Left (line 1, column 1):
unexpected 'a'
expecting end of input or "."

This is just due to the way you chain with >>. If the first parser succeeds, the second one is run. many succeeds (it is happy to match zero dots), so eof is tried. eof fails, so you only get eof's error message.

With manyTill, it tries both parsers (the second first) and, if both fail, the error messages are combined (this is because it uses <|> internally).
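To make the role of <|> concrete, here is a minimal sketch of how manyTill can be written. The name manyTill' is made up for this example, and the body is an approximation of the library's definition, not a copy of it.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical re-creation of manyTill, for illustration only.
-- At each step it tries `end` first; only if that fails does it try `p`.
-- Because the two branches are joined with <|>, a failure of both
-- can report what each of them was expecting.
manyTill' :: Parser a -> Parser end -> Parser [a]
manyTill' p end = scan
  where
    scan = (end >> return []) <|> ((:) <$> p <*> scan)

main :: IO ()
main = print (parse (manyTill' (char '.') eof) "" "a")
```

Running main on the input "a" should fail at column 1, since neither eof nor char '.' matches there.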

On the whole, though, it's easier to define your own errors with <?>:

> parse (many (char '.') >> eof <?> "lots of dots") "" "a"
Left (line 1, column 1):
unexpected 'a'
expecting lots of dots
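One thing to keep in mind with that example: <?> binds more loosely than >>, so the label is attached to the whole sequence, and it only replaces the message when the labelled parser fails without consuming input. The sketch below contrasts that with labelling the pieces individually. The names dotsWhole and dotsEach and the label strings are made up for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Same parser as in the example above. Because <?> has lower precedence
-- than >>, this reads as (many (char '.') >> eof) <?> "lots of dots".
dotsWhole :: Parser ()
dotsWhole = many (char '.') >> eof <?> "lots of dots"

-- Alternative: label each piece separately (hypothetical label names).
dotsEach :: Parser ()
dotsEach = many (char '.' <?> "a dot") >> (eof <?> "end of the dots")

main :: IO ()
main = do
  print (parse dotsWhole "" "a")    -- fails before consuming anything
  print (parse dotsWhole "" "..a")  -- fails after consuming two dots
  print (parse dotsEach  "" "..a")
```

Comparing the three results shows how far each label reaches. Which one is preferable depends on how specific you want the message to be.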


In a somewhat superficial sense, the reason for the difference in behavior is that many is a primitive parser whereas sepBy is constructed in a similar manner to your reimplemented many. In the latter case, the "expecting..." message is constructed based on alternatives that were available along the path that led to the parse failure; with many there were no such choices, it merely succeeded unconditionally.
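For reference, the reimplemented many from the question can be sketched as follows. The name many' is made up for this example. The point is only that the empty result is reached through a <|> alternative, so the failed many1 p branch is still on record when eof fails.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical reimplementation of many in terms of many1 and <|>.
-- The `return []` branch is only taken after `many1 p` has failed
-- without consuming input.
many' :: Parser a -> Parser [a]
many' p = many1 p <|> return []

main :: IO ()
main = print (parse (many' (char '.') >> eof) "" "a")
```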

I don't know that I'd describe this as either a bug or a feature, it's just sort of a quirk of how Parsec works. Error handling is not really Parsec's strength and this really doesn't seem like the first thing I'd worry about in that regard. If it bothers you sufficiently you may be better served by looking into other parsing libraries. I've heard good things about uu-parsinglib, for instance.


From the Haddock documentation:

many p applies the parser p zero or more times. Returns a list of the returned values of p.

So the empty string is valid input for the many combinator.

[Added]

Ah, now I see your point. An "expecting a or b" message is reported when <|> (the choice combinator) is used. many is implemented without using <|>, but sepBy uses it internally.
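As a rough sketch of what "uses it internally" means, sepBy can be written along these lines. The primed names are made up for this example, and the bodies approximate the library's definitions, so treat them as an illustration, not as the actual source.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical re-creation of sepBy: one or more items, or else nothing.
-- The <|> here is what lets the failed first branch contribute to the
-- "expecting ..." message.
sepBy' :: Parser a -> Parser sep -> Parser [a]
sepBy' p sep = sepBy1' p sep <|> return []

-- One item, followed by zero or more "separator then item" pairs.
sepBy1' :: Parser a -> Parser sep -> Parser [a]
sepBy1' p sep = do
  x  <- p
  xs <- many (sep >> p)
  return (x : xs)

main :: IO ()
main = print (parse (sepBy' (char '.') (char ',') >> eof) "" "a")
```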


This is a bug introduced in parsec-3.1. If you test with prior versions you should get an error message like this:

> parse (many (char '.') >> eof) "" "a"
Left (line 1, column 1):
unexpected 'a'
expecting "." or end of input

At least, that's what I get after fixing the bug :-)
