You indicated that you want to perform a static analysis to detect buffer overflows.
First, writing a grammar for C is harder than it sounds. There is everything in the standard, and then there is what the compilers really accept. And you must decide what to do about the preprocessor (and that varies from compiler to compiler!). If you do not get the grammar and the preprocessing right, you will not be able to parse real programs. (If you want to play with toy languages, that's fine, but then you don't need a C grammar.)
To do the analysis, you will need far more machinery than just an AST. You will need symbol tables, control-flow and data-flow analysis, probably local and global points-to analysis, call graph extraction, and some kind of range analysis.
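To give a feel for just one item on that list, here is a minimal sketch of the interval arithmetic that a range analysis is typically built on. It is an illustration under my own assumptions, not how any particular tool does it: the `Interval` type and the `interval_*` and `index_is_safe` names are hypothetical, arithmetic overflow is ignored, and a real analysis would also need the symbol tables, flow analysis and points-to information mentioned above just to know which intervals to combine.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical abstract value: every run-time value of an integer
   variable is assumed to lie in the closed range [lo, hi]. */
typedef struct { long lo; long hi; } Interval;

/* The interval for a literal constant c is [c, c]. */
static Interval interval_const(long c) {
    Interval r = { c, c };
    return r;
}

/* Join: used where two control-flow paths merge; the result covers both. */
static Interval interval_join(Interval a, Interval b) {
    assert(a.lo <= a.hi && b.lo <= b.hi);
    Interval r = { a.lo < b.lo ? a.lo : b.lo,
                   a.hi > b.hi ? a.hi : b.hi };
    return r;
}

/* Abstract addition (overflow of long is ignored in this sketch). */
static Interval interval_add(Interval a, Interval b) {
    Interval r = { a.lo + b.lo, a.hi + b.hi };
    return r;
}

/* True only if every index in idx provably fits an array of len elements. */
static bool index_is_safe(Interval idx, long len) {
    return idx.lo >= 0 && idx.hi < len;
}
```

With these pieces, an index known to be somewhere in 0..9 is provably safe for a 10-element array, while the same index plus one is not. The hard part in practice is everything around this: computing those intervals for real C code.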
People just don't get it.
**GETTING A PARSER IS A LONG WAY FROM DOING ANYTHING USEFUL WITH REAL LANGUAGES**
I am shouting because I see this again and again and again.
If you want to get on with a specific program analysis or transformation task, and you do not want to die of old age before you begin, you had better find a foundation that has most of what you need. A bare parser generator is not a foundation. (Don't get me wrong: ANTLR, YACC, and JavaCC are all fine parser generators, and they are great for building a parser for a new language. They are fine for implementing parsers for real languages if you make the investment. But they produce parsers, most people are not in the business of producing parsers, and they do not provide the additional machinery, not by a long shot.)
Our DMS Software Reengineering Toolkit contains all of the above machinery, because it is almost always needed and it is a royal headache to implement. (My team has invested 15 years.)
We have also instantiated this machinery for specific languages, namely COBOL, Java, C, and C++ (the last to a somewhat lesser extent; the language is really difficult), in a number of dialects, so that others do not need to repeat this lengthy process.
GCC and Clang are quite mature alternatives for C and C++.