What you are looking for seems to be a way to easily generate an abstract syntax tree for arbitrary C code. For this purpose (and if you are familiar with Python), I would suggest using pycparser:
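    from pycparser.c_parser import CParser

    parser = CParser()
    buf = '''
        static void foo(int k)
        {
            j = p && r || q;
            return j;
        }
    '''

    # The second argument is only a file name used in error messages.
    t = parser.parse(buf, 'x.c')
    t.show()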
generates:
    FileAST:
      FuncDef:
        Decl: foo, [], ['static']
          FuncDecl:
            ParamList:
              Decl: k, [], []
                TypeDecl: k, []
                  IdentifierType: ['int']
            TypeDecl: foo, []
              IdentifierType: ['void']
        Compound:
          Assignment: =
            ID: j
            BinaryOp: ||
              BinaryOp: &&
                ID: p
                ID: r
              ID: q
          Return:
            ID: j
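Once you have the tree, you usually want to walk it rather than just print it. Here is a minimal sketch of that, assuming pycparser is installed and using its `c_ast.NodeVisitor` base class; the class name `OpAndIdCollector` and the helper `collect_ops_and_ids` are made up for illustration and are not part of pycparser:

```python
from pycparser import c_ast, c_parser


class OpAndIdCollector(c_ast.NodeVisitor):
    """Hypothetical visitor: records binary operators and identifier uses."""

    def __init__(self):
        self.ops = []
        self.ids = []

    def visit_BinaryOp(self, node):
        # Record the operator, then keep descending into both operands.
        self.ops.append(node.op)
        self.generic_visit(node)

    def visit_ID(self, node):
        # ID nodes are identifier *uses*; declarations are Decl nodes.
        self.ids.append(node.name)


def collect_ops_and_ids(source):
    """Parse C source text and return (operators, identifiers) in visit order."""
    ast = c_parser.CParser().parse(source, filename='x.c')
    visitor = OpAndIdCollector()
    visitor.visit(ast)
    return visitor.ops, visitor.ids
```

Calling `collect_ops_and_ids` on the `foo` function above should give the `||` operator before the `&&` one, since `||` is the outer node in the tree. Note that the parameter `k` does not show up among the identifiers, because it is only declared, never used.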
Every compiler builds such a tree internally, and most of them expose an API to their parsing and semantic-analysis stages. In addition, any commonly used parser generator almost certainly has a C grammar available. If you are worried about performance and/or want to stay in C, I would suggest taking a look at:
- clang: a fairly complete C front end for LLVM that supports most gcc extensions. It makes it very easy to get an AST from C code: you can either link against clang as a library and work with the AST directly, or have the clang binary dump it to stdout.
- gcc (personally I would go with clang, it is much cleaner).
- ANTLR (a parser generator; many existing C grammars for it are floating around the Internet).
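For the "have the clang binary dump it to stdout" route, something like the following should work. It is only a sketch: it assumes a `clang` executable is on your PATH, and the function names are my own. The `-Xclang -ast-dump -fsyntax-only` flags ask clang to print the AST without generating any code:

```python
import shutil
import subprocess


def clang_ast_dump_command(source_path, clang="clang"):
    """Build the command line that makes clang print the AST of a file."""
    return [clang, "-Xclang", "-ast-dump", "-fsyntax-only", source_path]


def dump_ast(source_path, clang="clang"):
    """Run clang on source_path and return the textual AST dump."""
    if shutil.which(clang) is None:
        raise FileNotFoundError(f"{clang!r} not found on PATH")
    # check=False: clang exits non-zero on semantic errors, but it may
    # still have printed a (partial) AST that is worth returning.
    result = subprocess.run(
        clang_ast_dump_command(source_path, clang),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Unlike pycparser, clang does full semantic analysis, so the snippet above with its undeclared `j`, `p`, `r` and `q` will produce diagnostics; declare them first if you want a clean dump.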
Raeez