Are there any tools that parse a C header file and extract the function prototypes from it?

In particular, I need the return type of each function (and, if possible, whether it is a pointer type).

(I'm trying to auto-generate ioctl / dlsym wrapper libraries, to be LD_PRELOADed.) A Python or Ruby library is preferred, but any workable solution is welcome.

+6
4 answers

I have successfully used Haskell's Language.C package from Hackage (Haskell's answer to CPAN) to do something like this. It gives you a complete syntax tree for the C file (or header), which you can then traverse to extract the information you need. AFAIK this also works with #include, #define, etc.

I'm afraid I don't have the software at hand to test this, but it would look something like this:

    handler (DeclEvent (Declaration d)) = do
        let (VarDecl varName declAttr t) = getVarDecl d
        case t of
            (FunctionType (FunType returnType params isVaradic attrs)) ->
                do {- varName RETURNS returnType .... -}
            _ -> do return ()
        return ()
    handler _ = do return ()

    main = do
        let compiler = newGCC "gcc"
        ast <- parseCFile compiler Nothing opts cFileName
        case (runTrav newState (withExtDeclHandler (analyseAST ast) handler)) of
            ...

The above may look scary, but you probably won't need many more lines of Haskell to do what you want! I will gladly share the full source code I used (~200 lines) if it would be useful.

+5

What you are looking for seems to be a way to easily generate an abstract syntax tree for arbitrary C code. For this purpose (and if you are familiar with Python), I would suggest using pycparser:

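    from pycparser.c_parser import CParser

    parser = CParser()
    buf = '''
        static void foo(int k)
        {
            j = p && r || q;
            return j;
        }
    '''
    # Parse the snippet; 'x.c' is only the file name used in error messages.
    t = parser.parse(buf, 'x.c')
    t.show()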

generates:

    FileAST:
      FuncDef:
        Decl: foo, [], ['static']
          FuncDecl:
            ParamList:
              Decl: k, [], []
                TypeDecl: k, []
                  IdentifierType: ['int']
            TypeDecl: foo, []
              IdentifierType: ['void']
        Compound:
          Assignment: =
            ID: j
            BinaryOp: ||
              BinaryOp: &&
                ID: p
                ID: r
              ID: q
          Return:
            ID: j
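For the question's actual goal (return types of prototypes, and whether they are pointers), the same tree can be walked with a visitor. The following is only a minimal sketch: the two prototypes in `HEADER` are illustrative stand-ins, and `return_type` and `PrototypeVisitor` are hypothetical helper names, not part of pycparser. Note that pycparser expects already-preprocessed input, so a real header would first have to go through a C preprocessor.

```python
from pycparser import c_ast, c_parser

# Illustrative prototypes only; real headers must be preprocessed first.
HEADER = """
int ioctl(int fd, unsigned long request, ...);
void *dlsym(void *handle, const char *symbol);
"""


def return_type(func_decl):
    """Return (base type text, pointer depth) for a FuncDecl node."""
    node, depth = func_decl.type, 0
    # Each PtrDecl layer wrapped around the return type is one '*'.
    while isinstance(node, c_ast.PtrDecl):
        depth += 1
        node = node.type
    base = node.type
    if isinstance(base, c_ast.IdentifierType):
        text = " ".join(base.names)      # e.g. ['unsigned', 'long']
    else:
        text = type(base).__name__       # struct, enum, ...: report node kind only
    return text, depth


class PrototypeVisitor(c_ast.NodeVisitor):
    """Collects the return type of every function declaration it meets."""

    def __init__(self):
        self.found = {}

    def visit_Decl(self, node):
        if isinstance(node.type, c_ast.FuncDecl):
            self.found[node.name] = return_type(node.type)


ast = c_parser.CParser().parse(HEADER, filename="<header>")
visitor = PrototypeVisitor()
visitor.visit(ast)
for name, (base, depth) in visitor.found.items():
    print(name, "returns", base + "*" * depth)
```

A pointer depth greater than zero answers the "is the return type a pointer?" part; struct and enum return types are only reported by node kind here and would need extra handling.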

Every compiler does this, and most of them provide an API for accessing their parsing and semantic-analysis routines. In addition, any widely used parser generator should have grammars available for parsing C. If you are worried about performance and/or want to stay in C, I would suggest taking a look at:

  • clang: a fairly complete LLVM-based C implementation that supports most GCC extensions. It makes it very easy to build an AST from C code. You can either link against clang as a library and work directly with the AST, or have the clang binary dump it to stdout.
  • gcc (personally I would go with clang; it is much cleaner).
  • ANTLR (a parser generator; many existing C grammars are floating around on the Internet).
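As a rough sketch of the "have the clang binary dump it to stdout" route: recent clang versions can emit the AST as JSON via `-Xclang -ast-dump=json`, which is easy to consume from Python. The helper names `dump_ast` and `prototypes` and the hand-written `SAMPLE_TU` dictionary are assumptions made for illustration; the sample only mimics the few fields of clang's output that the code reads. Taking the text before the first `(` as the return type is a naive heuristic that breaks for functions returning function pointers.

```python
import json
import subprocess


def dump_ast(header_path, clang="clang"):
    """Run clang and return its JSON AST as a dict (assumes clang 9+ on PATH)."""
    out = subprocess.run(
        [clang, "-x", "c", "-Xclang", "-ast-dump=json", "-fsyntax-only", header_path],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def prototypes(tu):
    """Yield (name, return_type, is_pointer) for top-level FunctionDecl nodes."""
    for node in tu.get("inner", []):
        if node.get("kind") != "FunctionDecl":
            continue
        qual_type = node["type"]["qualType"]      # e.g. "void *(void *, const char *)"
        ret = qual_type.split("(", 1)[0].strip()  # naive: text before the parameter list
        yield node["name"], ret, ret.endswith("*")


# Hand-written stand-in for dump_ast("some_header.h"), so the walk can be
# tried without clang installed.
SAMPLE_TU = {
    "kind": "TranslationUnitDecl",
    "inner": [
        {"kind": "TypedefDecl", "name": "size_t", "type": {"qualType": "unsigned long"}},
        {"kind": "FunctionDecl", "name": "dlsym",
         "type": {"qualType": "void *(void *, const char *)"}},
        {"kind": "FunctionDecl", "name": "ioctl",
         "type": {"qualType": "int (int, unsigned long, ...)"}},
    ],
}

result = list(prototypes(SAMPLE_TU))
for name, ret, is_pointer in result:
    print(name, "->", ret, "(pointer)" if is_pointer else "")
```

Because clang runs the preprocessor itself, this route copes with #include and #define in real headers, at the cost of depending on an external binary.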
+4

The cproto program does this. Please note that there are two separate versions:

Until recently, GCC included a protoize program that could perform this task (and convert K&R function definitions into ISO prototype function definitions); it is no longer part of the GCC distribution.

+3

Our DMS Software Reengineering Toolkit with its C Front End can easily do this.

DMS uses a language definition (in this case, for C) to parse the source code, build abstract syntax trees, determine the types of expressions, and build complete symbol tables. It can also print the AST back out as actual language text (e.g. C code). You can easily find the function declarations and collect whatever you want from the symbol table entry for each one ("is the return type a pointer?") and/or print the declaration as a prototype. You may find that you need to normalize the symbols if you want to print a prototype that does not depend on other definitions in the actual file; this requires building ASTs for the various type declarations and substituting them into one another. We have done this for other customers in the past, and this machinery is available in the C Front End.

+2

Source: https://habr.com/ru/post/890123/

