
Translation from Python to CIL (C Intermediate Language)

Developer https://www.devze.com 2023-02-28 14:22 Source: Internet
I have recently been working on static analysis of Python source code. Our group already has a static analyzer for CIL (C Intermediate Language), written in OCaml. We want to reuse this analyzer, so our ideal approach is to translate Python to CIL.

Currently, I use Python's built-in ast module to parse Python source into a Python AST, and then I translate the text that ast.dump prints for that AST into a C AST. Because the C-AST-to-CIL API and the static analyzer are both written in OCaml, I chose ocamllex and ocamlyacc to parse the dumped Python AST into a C AST. However, there are some big problems.

The AST representation that ast.dump prints is hard to recognize structurally, which makes my parser difficult to implement. On the other hand, I can't use OCaml to access the internal structure of the Python ast objects. Even if I could, the data structures are different from OCaml's.
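One way to sidestep parsing the ast.dump text is to export the tree in a format that OCaml can already read. The following is a minimal sketch, not part of the original setup: it walks the tree with ast.iter_fields and emits JSON. The helper names node_to_dict and source_to_json and the "_type" key are made up for illustration, and it assumes the OCaml side would load the result with a JSON library such as Yojson.

```python
import ast
import json


def node_to_dict(node):
    """Recursively convert an ast node into plain dicts, lists, and scalars."""
    if isinstance(node, ast.AST):
        # "_type" is a hypothetical key holding the node's class name.
        result = {"_type": type(node).__name__}
        for field, value in ast.iter_fields(node):
            result[field] = node_to_dict(value)
        return result
    if isinstance(node, list):
        return [node_to_dict(item) for item in node]
    # Constant payloads: keep JSON-friendly scalars as they are.
    if node is None or isinstance(node, (bool, int, float, str)):
        return node
    # Anything else (bytes, complex, Ellipsis) is stored as its repr string.
    return repr(node)


def source_to_json(source):
    """Parse Python source and return its AST as a JSON string."""
    return json.dumps(node_to_dict(ast.parse(source)), indent=2)


tree = node_to_dict(ast.parse("x = 1 + 2"))
print(source_to_json("x = 1 + 2"))
```

Every node becomes an object with a uniform shape, so the consumer needs only a generic JSON reader plus a match on "_type" rather than a hand-written grammar for the dump format. This only addresses the parsing problem; it does not make the Python-to-C translation itself any easier.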

I wonder whether I chose the wrong approach to translating Python code to a C AST in the first place. Are there any existing tools or approaches that might meet my requirements?

If there is anything I have missed, please point it out; that would be a great help to me. Thanks.


I don't think this is going to work very well. CIL is essentially just the C language. For your trick to work, you have to translate Python completely to C... but the languages have very dissimilar concepts. How will you model Python objects? Continuations? Dynamic loading? Runtime typing? Arbitrary-precision arithmetic? I think your problems are not with the AST; rather, they are conceptual.
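To make the mismatch concrete, here is a small illustrative snippet (the names add, Point, and module_name are invented for this example). Each statement is ordinary Python, yet none has a direct C counterpart without emulating part of the Python runtime.

```python
import importlib


def add(a, b):
    # Runtime typing: the meaning of "+" depends on the operand types,
    # which are only known when the call executes.
    return a + b


int_sum = add(1, 2)
str_sum = add("py", "thon")
list_sum = add([1], [2])

# Arbitrary-precision arithmetic: this value does not fit in any fixed-width C integer.
big = 2 ** 100


class Point:
    pass


# Python objects: attributes can be added at run time under computed names.
p = Point()
setattr(p, "x", 10)

# Dynamic loading: the module to import is chosen by a run-time string.
module_name = "math"
mod = importlib.import_module(module_name)
root = mod.sqrt(16)
```

A C translation of add alone would need a tagged value representation and a dispatch on the tags, and the analyzer would then see that dispatch machinery rather than a simple addition.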

Even if you could translate to CIL, you'd then have a new problem. Analyzers are easier to build when the constructs they need to find are easily detected. Once you translate your continuations to C, reasoning about interactions with continuations will be hard, because they won't be easy to recognize.

I think I'd spend my energy on building a Python static analyzer in which the Python concepts are easy to detect.
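As a rough sketch of what "easy to detect" can look like, the standard ast.NodeVisitor class lets an analyzer match Python constructs directly at the AST level. The class DynamicFeatureFinder, the function find_dynamic_features, and the particular set of flagged built-ins are illustrative choices, not an established tool.

```python
import ast


class DynamicFeatureFinder(ast.NodeVisitor):
    """Collect (line, feature) pairs for a few dynamic Python constructs."""

    # Hypothetical watch list; a real analyzer would choose its own.
    DYNAMIC_CALLS = {"eval", "exec", "getattr", "setattr", "__import__"}

    def __init__(self):
        self.findings = []

    def visit_Call(self, node):
        # Only direct calls by bare name are matched, e.g. eval(...).
        if isinstance(node.func, ast.Name) and node.func.id in self.DYNAMIC_CALLS:
            self.findings.append((node.lineno, node.func.id))
        self.generic_visit(node)

    def visit_Yield(self, node):
        # Generators are visible as a single node kind in the Python AST.
        self.findings.append((node.lineno, "yield"))
        self.generic_visit(node)


def find_dynamic_features(source):
    finder = DynamicFeatureFinder()
    finder.visit(ast.parse(source))
    return sorted(finder.findings)


SAMPLE = """\
def gen(obj, name):
    value = getattr(obj, name)
    yield eval(value)
"""

print(find_dynamic_features(SAMPLE))
```

Here a yield or a getattr call is one node in the tree. After translation to C, the same constructs would be spread across state machines and runtime helper calls, which is the recognition problem described above.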
