Good idea. You are assuming one or more of the following:
a) that every tool that has a grammar uses one canonical parsing engine type (e.g., everybody uses Bison);
b) that there is some parsing tool that understands the zillion grammar specification schemes that exist;
c) that whatever the parser is, it will parse language fragments (presumably well-formed ones).
a) is clearly false. I have never seen b). Virtually no parsing engines do c); they can only parse "complete programs."
Your only hope IMHO is to use a parser generator that has a large number of well-tested language definitions.
ANTLR is possibly one; it certainly has a long list of language definitions available, and they are all in one place. As far as I know, it does not parse language fragments. I doubt that it has XML export for all of its parse trees.
Bison is perhaps one; there are many, many language processors built using Bison. But the definitions are scattered everywhere, and it would be very difficult to collect them. It does not parse language fragments either. I am pretty sure it has no XML export.
Our DMS Software Reengineering Toolkit is perhaps one. It has many language definitions, all of them collected in one place (our company). It produces an AST for every parse and has built-in XML export. DMS can also parse any nonterminal of any language it knows, which is what parsing language fragments requires.
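To make point c) concrete: parsing a fragment amounts to letting the caller choose which nonterminal serves as the goal, instead of always starting from the "complete program" rule. The following is only a toy sketch in Python under my own assumptions; the three-rule algebra grammar and the names `tokenize`, `Parser`, and `parse` are hypothetical, and this says nothing about how DMS, ANTLR, or Bison are implemented.

```python
import re

# Toy grammar (an assumption for illustration only):
#   formula = product (('+' | '-') product)*
#   product = term ('*' term)*
#   term    = NUMBER | VARIABLE | '(' formula ')'
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*()])")
GOALS = ("formula", "product", "term")


def tokenize(text):
    """Split the input into numbers, identifiers, and single-character operators."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent parser with one method per nonterminal."""

    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.i += 1
        return tok

    def formula(self):
        node = self.product()
        while self.peek() in ("+", "-"):
            node = ("formula", node, self.take(), self.product())
        return node

    def product(self):
        node = self.term()
        while self.peek() == "*":
            node = ("product", node, self.take(), self.term())
        return node

    def term(self):
        tok = self.take()
        if tok == "(":
            inner = self.formula()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return ("term", "(", inner, ")")
        if tok in ("+", "-", "*", ")"):
            raise SyntaxError(f"unexpected {tok!r}")
        return ("term", tok)


def parse(text, goal="formula"):
    """Parse text as an instance of the chosen nonterminal; reject leftover input."""
    if goal not in GOALS:
        raise ValueError(f"unknown nonterminal {goal!r}")
    parser = Parser(tokenize(text))
    tree = getattr(parser, goal)()
    if parser.peek() is not None:
        raise SyntaxError(f"input is not a complete {goal}")
    return tree


if __name__ == "__main__":
    print(parse("x * 23", goal="product"))  # a fragment, parsed as a product
    print(parse("x * 23 + y"))              # a whole formula
```

A hand-written parser like this gets goal selection almost for free. Most generated parsers are wired to a single start symbol, which is why fragment parsing is so rarely available.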
DMS can mimic your example very closely, given a DMS .lex file, an .atg ("attribute grammar") file, and a compatible source file.
What follows is the DMS lexer/parser build, and then a run with XML export, for an Algebra grammar defined as the DMS domain "Algebra" (look for ++XML about halfway down; that is the parse step being asked to export XML):
    C:\DMS\Domains\Algebra\Tools\Parser\Source>make
    perl /cygdrive/c/DMS/Executables/MakeDMSTool Algebra -lexer
    MakeDMSTool: Selected domain "Algebra".
    LexerGenerator V2.1a
    Copyright (c) 1999-2010 Semantic Designs, Inc.; All Rights Reserved
    Parsing lexical specification ...
    Processing mode Algebra ...
    Exiting with final status 0
    perl /cygdrive/c/DMS/Executables/MakeDMSTool Algebra -tool %Temporaries
    MakeDMSTool: Selected domain "Algebra".
    Using attribute grammar in "/cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source/Syntax/Algebra.atg"
    AttributeEvaluatorGenerator V3.0
    Copyright (c) 1999-2010 Semantic Designs, Inc.; All Rights Reserved
    Parsing attribute grammar ...
    Generating attribute evaluator(s) ...
    Exiting with final status 0
    rm -rf /cygdrive/c/DMS/Domains/Algebra/Tools/%Temporaries
    perl /cygdrive/c/DMS/Executables/MakeDMSTool Algebra -prettyprinter
    MakeDMSTool: Selected domain "Algebra".
    PrettyPrinterGenerator V2.0
    Copyright (c) 1999-2010 Semantic Designs, Inc.; All Rights Reserved
    Parsing pretty printer specification ...
    Generating pretty printer ...
    Exiting with final status 0
    AttributeEvaluatorGenerator V3.0
    Copyright (c) 1999-2010 Semantic Designs, Inc.; All Rights Reserved
    Parsing attribute grammar ...
    Generating attribute evaluator(s) ...
    ......................
    Exiting with final status 0
    cd /cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source/\%Generated; \
    perl /cygdrive/c/DMS/Executables/MakeDMSTool Algebra -weave-preserve-productions %PreserveProductions.*.par
    MakeDMSTool: Selected domain "Algebra".
    perl /cygdrive/c/DMS/Executables/MakeDMSTool Algebra -parser
    MakeDMSTool: Selected domain "Algebra".
    export PARLANSEINCLUDEDIRECTORIES=`perl -e '($_ = $ARGV[0].";/cygdrive/c/DMS/Domains/PARLANSE/Library/Arrays;/cygdrive/c/DMS/Domains/PARLANSE/Library/Bags;/cygdrive/c/DMS/Domains/PARLANSE/Library/HashTables;/cygdrive/c/DMS/Domains/PARLANSE/Library/Pipes;/cygdrive/c/DMS/Domains/PARLANSE/Library/Sequences;/cygdrive/c/DMS/Domains/PARLANSE/Library/Sets;/cygdrive/c/DMS/Domains/PARLANSE/Library/Stacks;/cygdrive/c/DMS/Domains/PARLANSE/Library/Utilities;/cygdrive/c/DMS/Domains/PARLANSE/Library/Algorithms/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/Booleans/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/Characters/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/Graphics/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/HashTrees/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/Numbers/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/References/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/SQL/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/Streams/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/SuffixTrees/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/System/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/Search/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/TestSupport/Source") =~ s!//(.)/!$1:/!g; $_ =~ s!/cygdrive/(.)/!$1:/!g; print $_' "/cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source;/cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source/Components;/cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source/%Generated;/cygdrive/c/DMS/Domains/DMSStringGrammar/Tools/DomainParser/Source;/cygdrive/c/DMS/Domains/Algebra/Tools/Lexer/Source;/cygdrive/c/DMS/Domains/Algebra/Tools/Lexer/Source/%Generated;/cygdrive/c/DMS/Domains/DMSLexical/Tools/DomainLexer/Source;/cygdrive/c/DMS/Infrastructure/HyperGraph/Source;/cygdrive/c/DMS/Domains"`; \
    cd `echo /cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source`; \
    nice /cygdrive/c/DMS/Domains/PARLANSE/Tools/Compiler/p0c.exe DomainParser.par
    PARLANSE0 Compiler V19.16.40 Semantic Designs, Inc.
    *** Confidential Information
    128/485/133408 smallest/average/largest activation record/grain stack space required.
    Largest stack space required by function at Line 1533 in file FFIModule.par 89 grains.
    3775 functions/procedures.
    223447 lines of source code read.
    7160772 bytes of object code.
    No errors detected.
    mv -f /cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source/DomainParser.P0B /cygdrive/c/DMS/Domains/Algebra/Tools/Parser/DomainParser.P0B

    C:\DMS\Domains\Algebra\Tools\Parser\Source>run ../DomainParser ++XML C:\DMS\Domains\Algebra\Tools\Lexer\TestCase\algebraformula.txt
    Domain Parser for Algebra 2.3.3
    Copyright (C) Semantic Designs 1996-2010; All Rights Reserved
    31 tree nodes in tree.
    <DMSForest>
     <tree node="formula" type="1" domain="1" id="10qx0" parents="0" line="1" column="1" file="1">
      <tree node="product" type="4" domain="1" id="10qwx" line="1" column="1" file="1">
       <tree node="term" type="10" domain="1" id="10qwy" line="1" column="1" file="1">
        <tree node="'D'" type="19" domain="1" id="10qw5" literal="0" line="1" column="1" file="1"/>
        <tree node="'['" type="20" domain="1" id="10qw6" literal="0" line="1" column="2" file="1"/>
        <tree node="formula" type="1" domain="1" id="10qwt" line="1" column="4" file="1">
         <tree node="product" type="4" domain="1" id="10qws" line="1" column="4" file="1">
          <tree node="term" type="9" domain="1" id="10qwr" line="1" column="4" file="1">
           <tree node="'('" type="17" domain="1" id="10qw7" literal="0" line="1" column="4" file="1"/>
           <tree node="formula" type="3" domain="1" id="10qwp" line="1" column="5" file="1">
            <tree node="formula" type="2" domain="1" id="10qwk" line="1" column="5" file="1">
             <tree node="formula" type="1" domain="1" id="10qwf" line="1" column="5" file="1">
              <tree node="product" type="5" domain="1" id="10qwe" line="1" column="5" file="1">
               <tree node="product" type="4" domain="1" id="10qwa" line="1" column="5" file="1">
                <tree node="term" type="7" domain="1" id="10qw9" line="1" column="5" file="1">
                 <tree node="VARIABLE" type="15" domain="1" id="10qw8" line="1" column="5" file="1">
                  <literal>x</literal>
                 </tree>
                </tree>
               </tree>
               <tree node="'*'" type="13" domain="1" id="10qwb" literal="0" line="1" column="7" file="1"/>
               <tree node="term" type="8" domain="1" id="10qwd" line="1" column="8" file="1">
                <tree node="NUMBER" type="16" domain="1" id="10qwc" literal="23" line="1" column="8" file="1"/>
               </tree>
              </tree>
             </tree>
             <tree node="'+'" type="11" domain="1" id="10qwg" literal="0" line="1" column="10" file="1"/>
             <tree node="product" type="4" domain="1" id="10qwj" line="1" column="12" file="1">
              <tree node="term" type="7" domain="1" id="10qwi" line="1" column="12" file="1">
               <tree node="VARIABLE" type="15" domain="1" id="10qwh" line="1" column="12" file="1">
                <literal>y</literal>
               </tree>
              </tree>
             </tree>
            </tree>
            <tree node="'-'" type="12" domain="1" id="10qwl" literal="0" line="1" column="13" file="1"/>
            <tree node="product" type="4" domain="1" id="10qwo" line="1" column="14" file="1">
             <tree node="term" type="7" domain="1" id="10qwn" line="1" column="14" file="1">
              <tree node="VARIABLE" type="15" domain="1" id="10qwm" line="1" column="14" file="1">
               <literal>z</literal>
              </tree>
             </tree>
            </tree>
           </tree>
           <tree node="')'" type="18" domain="1" id="10qwq" literal="0" line="1" column="15" file="1"/>
          </tree>
         </tree>
        </tree>
        <tree node="','" type="21" domain="1" id="10qwu" literal="0" line="1" column="16" file="1"/>
        <tree node="VARIABLE" type="15" domain="1" id="10qwv" line="1" column="18" file="1">
         <literal>x</literal>
        </tree>
        <tree node="']'" type="22" domain="1" id="10qww" literal="0" line="1" column="19" file="1"/>
       </tree>
      </tree>
     </tree>
     <FileIndex>
      <File index="1">C:/DMS/Domains/Algebra/Tools/Lexer/TestCase/algebraformula.txt</File>
     </FileIndex>
     <DomainIndex>
      <Domain index="1">Algebra</Domain>
     </DomainIndex>
    </DMSForest>
    Exiting with final status 0

    C:\DMS\Domains\Algebra\Tools\Parser\Source>
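Once the tree is in this XML form, any ordinary XML library can consume it. Here is a minimal Python sketch using the standard `xml.etree.ElementTree` module to list the leaf nodes with their source positions. The element and attribute names (`tree`, `node`, `literal`, `line`, `column`) are taken from the export shown here; the helper name `leaves` is my own, and `SAMPLE` is just a short excerpt of that export. The post does not say what the `literal` attribute means, so the code reports it without interpreting it.

```python
import xml.etree.ElementTree as ET

# Excerpt of the export (one subtree only), wrapped in the same root element.
SAMPLE = """<DMSForest>
 <tree node="product" type="5" domain="1" id="10qwe" line="1" column="5" file="1">
  <tree node="product" type="4" domain="1" id="10qwa" line="1" column="5" file="1">
   <tree node="term" type="7" domain="1" id="10qw9" line="1" column="5" file="1">
    <tree node="VARIABLE" type="15" domain="1" id="10qw8" line="1" column="5" file="1">
     <literal>x</literal>
    </tree>
   </tree>
  </tree>
  <tree node="'*'" type="13" domain="1" id="10qwb" literal="0" line="1" column="7" file="1"/>
  <tree node="term" type="8" domain="1" id="10qwd" line="1" column="8" file="1">
   <tree node="NUMBER" type="16" domain="1" id="10qwc" literal="23" line="1" column="8" file="1"/>
  </tree>
 </tree>
</DMSForest>"""


def leaves(forest_xml):
    """Return (node, literal, line, column) for each leaf <tree> element, in document order."""
    root = ET.fromstring(forest_xml)
    found = []
    for elem in root.iter("tree"):
        if elem.find("tree") is not None:
            continue  # interior node: it has <tree> children
        child = elem.find("literal")
        # The value appears either as a <literal> child or as a literal="..." attribute.
        value = child.text if child is not None else elem.get("literal")
        found.append((elem.get("node"), value, int(elem.get("line")), int(elem.get("column"))))
    return found


if __name__ == "__main__":
    for leaf in leaves(SAMPLE):
        print(leaf)
```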
If you really need an engine that understands many grammar notations, the easiest way is to build one with DMS. Simply define each grammar formalism (e.g., ANTLR's or Bison's) as a DSL for DMS, parse a specific instance of that formalism (e.g., an ANTLR BNF) using DMS, apply DMS rewrite rules to convert it into a DMS grammar, and then have DMS build the parser from it. (You would have to do the same for the lexer.)
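The "parse one notation, rewrite it into another" idea can be shown at toy scale. The Python sketch below is not DMS and does not use DMS rewrite rules. It reads a tiny subset of ANTLR-style rules of the form `name : alt | alt ;` and emits one rule per alternative in a hypothetical `name = alt ;` target notation. The function names, the sample rules, and the target syntax are all assumptions for illustration. A real converter needs a proper parser for the source formalism: this regex breaks on quoted `;` or `|`, and it ignores actions, EBNF operators, and lexer rules.

```python
import re

# One rule: a name, a colon, a body (matched lazily), and a terminating semicolon.
RULE = re.compile(r"(\w+)\s*:\s*(.*?)\s*;", re.S)


def parse_rules(text):
    """Parse 'name : alt1 | alt2 ;' rules into [(name, [[symbol, ...], ...]), ...]."""
    rules = []
    for name, body in RULE.findall(text):
        alternatives = [alt.split() for alt in body.split("|")]
        rules.append((name, alternatives))
    return rules


def emit_flat(rules):
    """Emit one 'name = symbols ;' line per alternative (hypothetical target notation)."""
    lines = []
    for name, alternatives in rules:
        for symbols in alternatives:
            lines.append(f"{name} = {' '.join(symbols)} ;")
    return "\n".join(lines)


# Illustrative input only; not taken from any real grammar file.
SOURCE = """
formula : product | formula '+' product | formula '-' product ;
product : term | product '*' term ;
"""

if __name__ == "__main__":
    print(emit_flat(parse_rules(SOURCE)))
```

The structure is the same two-step pipeline described above: build a tree for the source grammar, then transform it into the target form. What DMS adds is a real parser for each formalism and rewrite rules in place of ad hoc string handling.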