
Are there any tools for parsing a C header file and extracting function prototypes from it?

Developer https://www.devze.com 2023-03-11 20:56 Source: Internet

Especially getting the function return type (and, if possible, whether it's a pointer type).

(I'm trying to write an auto-generator for ioctl/dlsym wrapper libraries, to be LD_PRELOADed.) A Python or Ruby library would be preferred, but any workable solution is welcome.


I have successfully used Haskell's Language.C package from Hackage (Haskell's answer to CPAN) to do something similar. It will provide you with a complete parse tree of the C (or header) file, which can then be traversed to extract the needed information. It should, AFAIK, also work with #include, #define, and so on.

I'm afraid I don't have the relevant software installed to test it, but it would go something like this:

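import Language.C
import Language.C.Analysis
import Language.C.System.GCC (newGCC)

-- Called for every external declaration found during analysis.
-- Depending on the language-c version, the attributes may sit on
-- FunctionType rather than FunType; adjust the pattern to your version.
handler (DeclEvent (Declaration d)) = do
    let (VarDecl varName declAttr t) = getVarDecl d
    case t of
        (FunctionType (FunType returnType params isVariadic attrs)) ->
            return () {- varName RETURNS returnType .... -}
        _ -> return ()
handler _ = return ()

main = do
    let compiler = newGCC "gcc"
    -- parseCFile returns an Either ParseError CTranslUnit, so a real
    -- program has to unwrap the result before passing it to analyseAST.
    ast <- parseCFile compiler Nothing opts cFileName
    case (runTrav newState (withExtDeclHandler (analyseAST ast) handler)) of
        ...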

The above might look scary, but you probably won't be needing that many more lines of Haskell to do what you want! I'll gladly share the complete source code I used (~200 lines) if it can be of any help.


The cproto program does this. Note that there are two separate versions:

  • 4.6 based at SourceForge, and
  • 4.7j based at Freshmeat

Up until recently, GCC included a program protoize that could do that job (and convert K&R function definitions to ISO prototyped function definitions); that is no longer part of the GCC distribution, though.


What you're looking for, it seems, is a way to easily generate the abstract syntax tree (AST) of arbitrary C code. To this end (and if you're familiar with Python), I'd suggest using pycparser:

from pycparser.c_parser import CParser

parser = CParser()

buf = '''
  static void foo(int k)
  {
      j = p && r || q;
      return j;
  }
'''

t = parser.parse(buf, 'x.c')
t.show()

generates:

FileAST:
  FuncDef:
    Decl: foo, [], ['static']
      FuncDecl:
        ParamList:
          Decl: k, [], []
            TypeDecl: k, []
              IdentifierType: ['int']
        TypeDecl: foo, []
          IdentifierType: ['void']
    Compound:
      Assignment: =
        ID: j
        BinaryOp: ||
          BinaryOp: &&
            ID: p
            ID: r
          ID: q
      Return:
        ID: j
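For the return-type part of the question, the tree above can be walked with pycparser's NodeVisitor. The following is only a minimal sketch, not code from the original answer: the header text, the function names in it, and the helpers describe_return and collect_prototypes are made up for illustration. It assumes the input has already been preprocessed (pycparser does not expand #include or #define) and it does not handle functions that return function pointers.

```python
from pycparser import c_ast, c_parser

# Hypothetical, already-preprocessed header text. Real headers are usually
# run through a preprocessor (for example `gcc -E`) before parsing.
HEADER = """
int open_device(const char *path, int flags);
void *lookup_symbol(void *handle, const char *name);
static char **split_args(char *line);
"""


def describe_return(func_decl):
    """Return (base type text, pointer depth) for a c_ast.FuncDecl node."""
    node = func_decl.type
    depth = 0
    while isinstance(node, c_ast.PtrDecl):  # one PtrDecl per '*'
        depth += 1
        node = node.type
    base = node.type  # node is assumed to be a TypeDecl at this point
    if isinstance(base, c_ast.IdentifierType):
        name = " ".join(base.names)
    else:  # Struct, Union or Enum
        name = f"{type(base).__name__.lower()} {base.name}"
    return " ".join(list(node.quals or []) + [name]), depth


class PrototypeCollector(c_ast.NodeVisitor):
    """Collect every declaration whose type is a function declarator."""

    def __init__(self):
        self.prototypes = {}

    def visit_Decl(self, node):
        if isinstance(node.type, c_ast.FuncDecl):
            self.prototypes[node.name] = describe_return(node.type)


def collect_prototypes(source):
    """Parse C source text and map function names to their return info."""
    ast = c_parser.CParser().parse(source, filename="<header>")
    collector = PrototypeCollector()
    collector.visit(ast)
    return collector.prototypes


for name, (base, depth) in collect_prototypes(HEADER).items():
    kind = "pointer" if depth else "non-pointer"
    print(f"{name}: returns {base}{'*' * depth} ({kind})")
```

A pointer depth greater than zero answers the "is it a pointer type" question; the same visitor could also read node.type.args to recover the parameter list for a generated wrapper.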

Every compiler does this, and most provide an API for accessing their various parsing/semantic checking routines. Also, any commonly used parser generator should have grammars available for parsing C. If you're concerned about performance and/or want to stay within C, I'd suggest taking a look at:

  • clang: a fairly complete C implementation on the LLVM architecture, supporting most GCC extensions. Very easy to generate ASTs from C code. You could either compile in clang as a lib and work with the ASTs directly, or have the clang binary dump them out to stdout.
  • GCC (I'd personally go with clang; much cleaner).
  • ANTLR (a parser generator; many existing solutions for C are floating around the internet).
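As a rough illustration of the "have the clang binary dump them out" route, the sketch below post-processes clang's JSON AST dump in Python. It is an assumption-laden sketch rather than a tested recipe: it assumes a clang recent enough to accept `-Xclang -ast-dump=json`, and the JSON field names used here ("kind", "inner", "name", "type"/"qualType") should be checked against the output of your clang version. The helper names and the sample data are hypothetical, and taking the text before the first parenthesis as the return type breaks for functions that return function pointers.

```python
import json
import subprocess


def dump_ast(header_path, clang="clang"):
    """Run clang and return its JSON AST dump as a dict (needs clang on PATH)."""
    out = subprocess.run(
        [clang, "-x", "c", "-fsyntax-only", "-Xclang", "-ast-dump=json", header_path],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def function_returns(ast):
    """Map each top-level function name to (return type text, is_pointer)."""
    result = {}
    for node in ast.get("inner", []):
        if node.get("kind") != "FunctionDecl":
            continue
        qual_type = node["type"]["qualType"]      # e.g. "void *(void *, const char *)"
        ret = qual_type.split("(", 1)[0].strip()  # text before the parameter list
        result[node["name"]] = (ret, ret.endswith("*"))
    return result


# Hand-written stand-in for clang's output, trimmed to the fields used above.
SAMPLE_AST = {
    "kind": "TranslationUnitDecl",
    "inner": [
        {"kind": "TypedefDecl", "name": "size_t",
         "type": {"qualType": "unsigned long"}},
        {"kind": "FunctionDecl", "name": "open_device",
         "type": {"qualType": "int (const char *, int)"}},
        {"kind": "FunctionDecl", "name": "lookup_symbol",
         "type": {"qualType": "void *(void *, const char *)"}},
    ],
}

print(function_returns(SAMPLE_AST))
```

With clang installed, `function_returns(dump_ast("some_header.h"))` would run the same extraction on a real file; the file name here is a placeholder.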


Our DMS Software Reengineering Toolkit with its C Front End would easily be able to do this.

DMS uses a language definition (in this case, the C language) to parse source code, build abstract syntax trees, determine the types of expressions, and build complete symbol tables. It can also prettyprint ASTs back to valid language text (e.g., C code). You can easily locate function declarations, and collect whatever you want from the symbol table entry for each one ("is the return type a pointer?"), and/or print the declaration as a prototype. You may find you need to normalize symbols if you want to print out a prototype that does not depend on other definitions in the actual file; this requires building the ASTs for the various type declarations and substituting them into one another. We have done this for other customers in the past, and this machinery is available in the C Front End.
