For Java, see What will the AST (Abstract Syntax Tree) look like for an object-oriented programming language?
For C, see get human readable AST from C++ code
Both are produced by one engine: our DMS Software Reengineering Toolkit. DMS also has a full C++11 parser that can produce similar XML. (EDIT Jan 2016: now full C++14 for GCC and Visual C++.)
I don't think XML is really a good idea: it is huge and klunky, and the analysis tools you can bring to bear on it are... what? XSLT? That is not very useful for program analysis. Read the XML into a DOM and walk over it? You will find that you lack useful support (symbol tables, etc.); ASTs alone are simply not enough. See my essay on Life After Parsing (check my bio or google it).
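To make the "ASTs are not enough" point concrete, here is a minimal Python sketch. The XML schema (`block`, `decl`, `assign`, `literal` and their attributes) is invented purely for illustration; it is not the XML that DMS or any real parser emits. The DOM hands you the tree and nothing else, so even the simple question "which declaration does this assignment refer to?" forces you to hand-build a scope stack, which is a tiny symbol table.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML AST. Element and attribute names are assumptions made up
# for this example; they do not match the output of DMS or any real tool.
XML_AST = """
<block>
  <decl name="x" type="int"/>
  <assign target="x"><literal value="1"/></assign>
  <block>
    <decl name="x" type="double"/>
    <assign target="x"><literal value="2.5"/></assign>
  </block>
  <assign target="y"><literal value="3"/></assign>
</block>
"""


def resolve_assignments(node, scopes=None, out=None):
    """Walk the tree and pair each assignment target with its declared type.

    The DOM offers no name resolution, so the scope stack (a list of dicts,
    innermost scope last) has to be maintained by hand during the walk.
    """
    scopes = [] if scopes is None else scopes
    out = [] if out is None else out

    if node.tag == "block":
        scopes.append({})              # entering a block opens a new scope
        for child in node:
            resolve_assignments(child, scopes, out)
        scopes.pop()                   # leaving the block discards its names
    elif node.tag == "decl":
        scopes[-1][node.get("name")] = node.get("type")
    elif node.tag == "assign":
        name = node.get("target")
        # Search from the innermost scope outward; None means undeclared.
        declared = next((s[name] for s in reversed(scopes) if name in s), None)
        out.append((name, declared))
    return out


results = resolve_assignments(ET.fromstring(XML_AST))
for name, declared in results:
    print(name, "->", declared or "undeclared")
```

Even this toy covers only nested block scoping. Real languages add overloading, namespaces, templates, inheritance, and more, and that is the kind of machinery you end up rebuilding on top of a bare XML tree.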
You are better off using a set of integrated mechanisms that provides all kinds of consistent support for analyzing (multiple) programming languages, using ASTs as the foundation. This is what DMS is designed to do.