
What are resources for building a static analyzer for C in C?

Developer https://www.devze.com 2023-02-09 12:19 Source: web
I have a school project to develop a static analyzer in C for C.

Where should I start? What are some resources which could assist me?

I am assuming I will need to parse C, so what are some good parsers for C or tools for building C parsers?


I would first head over to ANTLR and look at its getting started guide; it has a wealth of information about parsing. I personally use ANTLR because it gives a choice of code generation targets.

To use ANTLR you need a C or C++ grammar file; pick one of these up and start playing.

Anyway, have fun with it.


Probably your best starting point would be Clang (with the proviso that it already has a static analyzer, so unless you want to write one for its own sake, you might be better off using or enhancing the existing one).


Are you sure that you want to write the analyzer in C?

If you were using a modern language (e.g. C#, Java, Python), then I would second spgennard's suggestion of ANTLR for the parser.

If writing the analyzer in C is a requirement, then you are stuck with lex and yacc (or flex and bison), or maybe a hand-crafted parser.
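To give a feel for the hand-crafted route, here is a minimal sketch of the lexing half in C. It only handles identifiers, decimal integers, and single-character punctuation; the names `Token`, `next_token`, and `count_tokens` are made up for illustration, and keywords such as `int` are deliberately lexed as plain identifiers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token kinds for a tiny subset of C; a real lexer needs many more
   (keywords, string literals, multi-character operators, comments). */
typedef enum { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PUNCT } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start; /* points into the source buffer, not NUL-terminated */
    size_t length;
} Token;

/* Scan one token starting at *cursor and advance *cursor past it. */
Token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;

    /* Skip leading whitespace. */
    while (isspace((unsigned char)*p))
        p++;

    Token tok = { TOK_EOF, p, 0 };
    if (*p == '\0') {
        /* End of input: kind stays TOK_EOF. */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        tok.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p))
            p++;
    } else {
        tok.kind = TOK_PUNCT; /* single-character punctuation only */
        p++;
    }

    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}

/* Count the tokens in a string, excluding the final TOK_EOF. */
size_t count_tokens(const char *source)
{
    size_t n = 0;
    while (next_token(&source).kind != TOK_EOF)
        n++;
    return n;
}
```

Calling `count_tokens("int x = 42;")` walks the tokens `int`, `x`, `=`, `42`, and `;`. A recursive-descent parser would call `next_token` the same way and build tree nodes instead of counting.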

Looks like Uno comes close to what you want to do. It uses lex/yacc and includes the grammar files. The analysis part however is written in C++.

Maybe you can get some more ideas about the how and what from tools listed at SpinRoot. Wikipedia also has some good info.


Parsing is the easiest and least important part of a static analyser. ANTLR was already suggested; it should be sufficient for parsing plain C (but not C++). Just a little tip: do not implement your own preprocessor; reuse the output of gcc -E instead.
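As a rough sketch of the gcc -E tip, the analyser can start the preprocessor as a child process and read its output through a pipe. This assumes a POSIX system (for `popen`) and a `gcc` on the PATH; the helper names `build_preprocess_command` and `count_output_lines` are hypothetical, and counting lines merely stands in for handing the text to a lexer.

```c
#define _POSIX_C_SOURCE 200809L /* for popen/pclose */
#include <assert.h>
#include <stdio.h>

/* Write "gcc -E <path>" into buf. Returns 0 on success, -1 if it does not fit.
   The path is not shell-escaped, so this sketch is unsafe for untrusted names. */
int build_preprocess_command(char *buf, size_t size, const char *path)
{
    assert(buf != NULL && path != NULL);
    int n = snprintf(buf, size, "gcc -E %s", path);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Run a shell command and count the lines it prints.
   Returns -1 if the command cannot be started. */
long count_output_lines(const char *command)
{
    assert(command != NULL);
    FILE *pipe = popen(command, "r");
    if (pipe == NULL)
        return -1;

    long lines = 0;
    int c;
    while ((c = fgetc(pipe)) != EOF) {
        if (c == '\n')
            lines++;
    }
    pclose(pipe);
    return lines;
}
```

The preprocessed output keeps `# <line> "<file>"` markers, which are worth parsing so that diagnostics can point back at the original source lines.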

As for the rest, take a look at the sources of some existing analysers, namely Clang and CIL, and read about SSA representation and abstract interpretation. Choosing the right intermediate representation for your code is key.
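To make abstract interpretation a bit more concrete, here is a minimal sketch of one of the simplest abstract domains, the sign domain, in C. Instead of tracking exact values, the analyser tracks whether each value is negative, zero, positive, or unknown. The type and function names are invented for this example, and integer overflow is ignored (values are treated as mathematical integers).

```c
#include <assert.h>

/* Abstract values: BOTTOM means "no value / unreachable",
   TOP means "could be any sign". */
typedef enum { SIGN_BOTTOM, SIGN_NEG, SIGN_ZERO, SIGN_POS, SIGN_TOP } Sign;

/* Abstraction of a single concrete integer. */
Sign sign_of(long value)
{
    if (value < 0) return SIGN_NEG;
    if (value > 0) return SIGN_POS;
    return SIGN_ZERO;
}

/* Least upper bound: used where two control-flow paths merge. */
Sign sign_join(Sign a, Sign b)
{
    assert(a <= SIGN_TOP && b <= SIGN_TOP);
    if (a == b) return a;
    if (a == SIGN_BOTTOM) return b;
    if (b == SIGN_BOTTOM) return a;
    return SIGN_TOP;
}

/* Abstract addition: the sign of x + y given only the signs of x and y. */
Sign sign_add(Sign a, Sign b)
{
    assert(a <= SIGN_TOP && b <= SIGN_TOP);
    if (a == SIGN_BOTTOM || b == SIGN_BOTTOM) return SIGN_BOTTOM;
    if (a == SIGN_TOP || b == SIGN_TOP) return SIGN_TOP;
    if (a == SIGN_ZERO) return b;
    if (b == SIGN_ZERO) return a;
    return (a == b) ? a : SIGN_TOP; /* pos + neg could be anything */
}

/* Abstract multiplication. */
Sign sign_mul(Sign a, Sign b)
{
    assert(a <= SIGN_TOP && b <= SIGN_TOP);
    if (a == SIGN_BOTTOM || b == SIGN_BOTTOM) return SIGN_BOTTOM;
    if (a == SIGN_ZERO || b == SIGN_ZERO) return SIGN_ZERO; /* 0 * anything */
    if (a == SIGN_TOP || b == SIGN_TOP) return SIGN_TOP;
    return (a == b) ? SIGN_POS : SIGN_NEG;
}
```

An analyser built on this would evaluate each expression with `sign_add` and `sign_mul` in place of the real operators, and call `sign_join` where branches of an `if` meet. A divisor whose abstract value is `SIGN_ZERO` or `SIGN_TOP` is then a candidate for a possible division-by-zero warning.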

I doubt it can be an easy task in plain C, so you'd probably end up implementing some sort of DSL on top of it to handle ASTs and transforms. Sounds like something much bigger than a typical school project.
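For a sense of what handling ASTs and transforms looks like in plain C without any DSL, here is a small sketch: a tagged node struct for arithmetic expressions and one bottom-up transform (constant folding). All names here are hypothetical, the node layout is just one possible choice, and arithmetic overflow is ignored.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { NODE_CONST, NODE_VAR, NODE_ADD, NODE_MUL } NodeKind;

/* One struct for every node kind; the kind tag says which fields are valid. */
typedef struct Node {
    NodeKind kind;
    long value;         /* valid for NODE_CONST */
    const char *name;   /* valid for NODE_VAR (not owned by the node) */
    struct Node *lhs;   /* valid for NODE_ADD and NODE_MUL */
    struct Node *rhs;
} Node;

static Node *alloc_node(NodeKind kind)
{
    Node *n = calloc(1, sizeof *n);
    if (n == NULL)
        abort(); /* keep the sketch simple: give up on allocation failure */
    n->kind = kind;
    return n;
}

Node *new_const(long value)
{
    Node *n = alloc_node(NODE_CONST);
    n->value = value;
    return n;
}

Node *new_var(const char *name)
{
    Node *n = alloc_node(NODE_VAR);
    n->name = name;
    return n;
}

Node *new_binary(NodeKind kind, Node *lhs, Node *rhs)
{
    assert(kind == NODE_ADD || kind == NODE_MUL);
    assert(lhs != NULL && rhs != NULL);
    Node *n = alloc_node(kind);
    n->lhs = lhs;
    n->rhs = rhs;
    return n;
}

void free_tree(Node *n)
{
    if (n == NULL)
        return;
    free_tree(n->lhs);
    free_tree(n->rhs);
    free(n);
}

/* Fold operators whose operands are both constants, working bottom-up.
   The node is rewritten in place and returned. */
Node *fold_constants(Node *n)
{
    assert(n != NULL);
    if (n->kind != NODE_ADD && n->kind != NODE_MUL)
        return n;

    n->lhs = fold_constants(n->lhs);
    n->rhs = fold_constants(n->rhs);

    if (n->lhs->kind == NODE_CONST && n->rhs->kind == NODE_CONST) {
        long a = n->lhs->value;
        long b = n->rhs->value;
        long result = (n->kind == NODE_ADD) ? a + b : a * b;
        free(n->lhs);
        free(n->rhs);
        n->kind = NODE_CONST;
        n->value = result;
        n->lhs = NULL;
        n->rhs = NULL;
    }
    return n;
}
```

With this, `(2 + 3) * x` folds to `5 * x`, while `(2 + 3) * 4` collapses to a single constant node. Every new transform needs the same kind of hand-written switch over node kinds and manual memory management, which is the boilerplate a DSL would be hiding.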
