How should I start writing a parser for BibTeX files? As an initial design, I see the following steps:
- Write down the grammar
- Build a tokenizer
- Parse the token stream against the grammar
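As a rough illustration of the tokenizer step, here is a minimal sketch in Python (the same structure should carry over to C#). The token set is a simplifying assumption, not the full BibTeX grammar, and the names `Token`, `TOKEN_RE` and `tokenize` are made up for this example. Each token carries the line it starts on, so later stages can report positions.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str   # e.g. "AT", "NAME", "LBRACE"
    text: str   # the matched source text
    line: int   # 1-based line on which the token starts


# Simplified token set (an assumption): real BibTeX also has @string,
# @comment, '#' concatenation and parenthesised entries.
TOKEN_RE = re.compile(r"""
    (?P<AT>@)
  | (?P<LBRACE>\{)
  | (?P<RBRACE>\})
  | (?P<COMMA>,)
  | (?P<EQUALS>=)
  | (?P<STRING>"[^"]*")
  | (?P<NAME>[^\s@{}",=]+)
  | (?P<NEWLINE>\n)
  | (?P<SPACE>[ \t\r]+)
""", re.VERBOSE)


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens with line numbers; whitespace is skipped."""
    line = 1
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            # For example an unterminated quoted string.
            raise SyntaxError(f"line {line}: unexpected character {source[pos]!r}")
        kind, text = m.lastgroup, m.group()
        if kind not in ("NEWLINE", "SPACE"):
            yield Token(kind, text, line)
        # Quoted strings may span lines, so count newlines in every match.
        line += text.count("\n")
        pos = m.end()
```

Because whitespace tokens are dropped, a parser built on top of this would have to recover the text of brace-delimited values from the source by some other means; that is one reason a hand-written scanner is sometimes preferred for BibTeX.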
We also need an error-reporting mechanism, so that users uploading BibTeX files can see the line numbers of the errors in their files. I am looking for the community's opinion on how to approach this problem.
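To make the line-number requirement concrete, here is a small self-contained sketch in Python of a hand-written parser that raises an error carrying the offending line. It is only an illustration under simplifying assumptions: it handles plain `@type{key, field = value, ...}` entries, rejects any text outside entries (real BibTeX ignores it), and does not support `@string`, `@comment` or `#` concatenation. The names `BibTexError`, `Parser` and `parse` are hypothetical.

```python
import re

NAME_RE = re.compile(r'[^\s@{}",=]+')


class BibTexError(Exception):
    """Parse error that remembers the 1-based line number."""
    def __init__(self, message, line):
        super().__init__(f"line {line}: {message}")
        self.line = line


class Parser:
    def __init__(self, source):
        self.src = source
        self.pos = 0

    def error(self, message):
        # Derive the line number from the current offset.
        line = self.src.count("\n", 0, self.pos) + 1
        return BibTexError(message, line)

    def peek(self):
        while self.pos < len(self.src) and self.src[self.pos].isspace():
            self.pos += 1
        return self.src[self.pos:self.pos + 1]  # "" at end of input

    def expect(self, ch):
        if self.peek() != ch:
            raise self.error(f"expected {ch!r}")
        self.pos += 1

    def name(self, what):
        self.peek()
        m = NAME_RE.match(self.src, self.pos)
        if m is None:
            raise self.error(f"expected {what}")
        self.pos = m.end()
        return m.group()

    def value(self):
        ch = self.peek()
        if ch == "{":                       # brace-delimited, may nest
            depth, start = 0, self.pos
            for i in range(start, len(self.src)):
                if self.src[i] == "{":
                    depth += 1
                elif self.src[i] == "}":
                    depth -= 1
                    if depth == 0:
                        self.pos = i + 1
                        return self.src[start + 1:i]
            raise self.error("unbalanced '{'")  # line of the opening brace
        if ch == '"':                       # quote-delimited
            end = self.src.find('"', self.pos + 1)
            if end == -1:
                raise self.error("unterminated string")
            text = self.src[self.pos + 1:end]
            self.pos = end + 1
            return text
        return self.name("a field value")   # bare number or macro name

    def entry(self):
        self.expect("@")
        kind = self.name("an entry type").lower()
        self.expect("{")
        key = self.name("a citation key")
        fields = {}
        while True:
            if self.peek() == "}":
                self.pos += 1
                return {"type": kind, "key": key, "fields": fields}
            self.expect(",")
            if self.peek() == "}":          # allow a trailing comma
                continue
            field = self.name("a field name").lower()
            self.expect("=")
            fields[field] = self.value()


def parse(source):
    """Parse all entries, raising BibTexError on the first problem."""
    p = Parser(source)
    entries = []
    while p.peek() != "":
        entries.append(p.entry())
    return entries
```

A caller could catch `BibTexError` and show `err.line` next to the uploaded file. This sketch stops at the first error; collecting several errors per file would need some recovery strategy, such as skipping ahead to the next `@`.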
(Please point out any existing open source C# or VB.NET BibTeX parsers.)
There are many tools available to assist you with this, such as ANTLR or the GOLD Parsing System. I usually use the latter to create my parser grammars.
I've published an open source library for the BibTeX format (load/save/export to Excel), allowing both untyped (key/value dictionary) and strongly typed access to the BibTeX entries.
It might not fit your purpose well, as it is weak on validation (it has none :) ), but it might help anyway:
- Nuget Package
- GitHub repository
- About the package on my web site