I'm not familiar with Ragel, but I have written some custom parsers and scanners.
Your question seems to be more related to finding keywords than finding common identifiers.
You have rules telling Ragel how to decide whether a piece of input is a number, the keyword "return", a semicolon, the keyword "returns", an identifier, and so on. Although you can write a rule for each keyword, I would not recommend it.
What I have learned from experience is that it is better to read all explicit keywords as identifiers (assign them a common identifier token) and to work out which identifiers are “keywords” somewhere in your C/C++ code.
In other words, Ragel will only detect identifiers: "myvar", "return" and "returns" will all be marked as identifiers. Later, in your semantic action code (which is C/C++, not Ragel), you check each identifier and determine whether it is a keyword. This is usually done with a list of keywords.
I think it will be something like this:
    %%{
        Identifier = (alpha | '_') . (alnum | '_')*;

        action IdentifierAction {
            // Keyword list; extend as needed.
            static const char* const Keywords[] = { "return", "if", "else" };
            // ts and te are the scanner's token start and end pointers.
            std::string MyIdentifier(ts, te - ts);
            // SearchKeywordCode is a helper you write yourself.
            if (SearchKeywordCode(Keywords, MyIdentifier)) {
                std::cout << "keyword(\"" << MyIdentifier << "\")";
            } else {
                std::cout << "identifier(\"" << MyIdentifier << "\")";
            }
        }
    }%%
So there is no "return" or "returns" rule, just an "Identifier" rule.
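The keyword check itself could look roughly like the sketch below. It is only an illustration: `IsKeyword`, `ClassifyIdentifier` and `TokenKind` are names I made up for this example (they are not part of Ragel), and the keyword list is deliberately tiny. The only thing it assumes from the scanner is a pair of start/end pointers such as `ts` and `te`.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical token kinds for this sketch; a real scanner would have more.
enum class TokenKind { Keyword, Identifier };

// Returns true if `text` is in the keyword list.
// The list is an assumption for illustration; extend it for your language.
bool IsKeyword(const std::string& text) {
    static const std::unordered_set<std::string> keywords = {
        "return", "if", "else"
    };
    return keywords.count(text) != 0;
}

// Classifies a matched identifier given its start and end pointers
// (for example the ts and te pointers of a Ragel scanner).
TokenKind ClassifyIdentifier(const char* ts, const char* te) {
    const std::string text(ts, te);  // copy the matched characters
    return IsKeyword(text) ? TokenKind::Keyword : TokenKind::Identifier;
}
```

Inside the identifier action you would then call `ClassifyIdentifier(ts, te)` and emit a keyword or identifier token depending on the result. Because the whole identifier is matched first, "returns" is looked up as a complete word and is not mistaken for "return".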