Simple XML parser in bison / flex

I would like to create a simple XML parser using bison / flex. I don't need validation, comments, or attributes, only <tag>value</tag>, where the value can be a number, a string, or another <tag>value</tag>.

So for example:

<div>
  <mul>
    <num>20</num>
    <add>
      <num>1</num>
      <num>5</num>
    </add>
  </mul>
  <id>test</id>
</div>

If it helps, I know the names of all the tags that may occur, and I know how many subtags each tag can hold. Is it possible to create a parser that would do something like this:

- new Tag("num", 1)           // tag1
- new Tag("num", 5)           // tag2
- new Tag("add", tag1, tag2)  // tag3
- new Tag("num", 20)          // tag4
- new Tag("mul", tag4, tag3)
...
- root = top_tag

Tag and number of subtags:

  • num: 1 (value only)
  • str: 1 (value only)
  • add | sub | mul | div: 2 (num | str | tag, num | str | tag)

Could you help me with a grammar that builds the AST as described above?


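Take a look at the yax project. From its README:

yax: YACC (actually Gnu Bison) for parsing and processing XML documents.

Its key piece is a library that produces an XML lexical token stream from an XML document.

This stream can be wrapped as an instance of yylex() that feeds tokens to a Bison grammar, which then parses and processes the XML document.

Using the stream plus a Bison grammar, it is possible to carry out at least the following:

  • validate XML documents,
  • parse XML documents directly into internal data structures,
  • construct DOM trees.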

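I don't think this is the best tool for writing an XML parser. If I had to do this job, I would do it by hand.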

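The Flex code will contain: NUM, which matches an integer in this example; STR, which matches any string that does not contain '<' or '>'; STOP, which matches closing tags; and START, which matches opening tags.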

"<?"[^>]*"?>"   { /* skip the XML declaration */ }
"<"[a-z]+">"    { return START; }
"</"[a-z]+">"   { return STOP; }
[0-9]+          { return NUM; }
[ \t\r\n]+      { /* skip whitespace between tags */ }
[^<>]+          { return STR; }

Bison:

%token START STOP STR NUM
%%
simple_xml : START value STOP
;
value : simple_xml 
| STR
| NUM
| value simple_xml
;
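
To show how the grammar above maps onto the AST the question asks for, here is a minimal hand-written sketch in C++. It is not bison output: it is a recursive-descent stand-in that follows the same two rules (simple_xml and value) and the same token kinds. The Tag struct, the Lexer and Parser types, and the show() helper are all hypothetical names introduced for illustration; unlike the grammar, the sketch also checks that the closing tag name matches the opening one.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical AST node: a tag name plus a text value and/or child tags.
struct Tag {
    std::string name;
    std::string value;                       // set for leaves such as <num>20</num>
    std::vector<std::unique_ptr<Tag>> kids;  // set for tags that hold other tags
};

enum Tok { START, STOP, STR, NUM, END };

// Hand-written stand-in for the flex rules.
struct Lexer {
    std::string in;
    std::size_t pos;
    std::string text;  // tag name for START/STOP, content for STR/NUM

    explicit Lexer(std::string s) : in(std::move(s)), pos(0) {}

    static bool space(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }

    Tok next() {
        while (pos < in.size() && space(in[pos])) ++pos;  // skip whitespace
        if (pos >= in.size()) return END;
        if (in[pos] == '<') {
            std::size_t close = in.find('>', pos);
            if (close == std::string::npos) throw std::runtime_error("unterminated tag");
            std::size_t from = pos + 1;
            pos = close + 1;
            if (in[from] == '?') return next();  // skip <?...?>
            bool stop = in[from] == '/';
            if (stop) ++from;
            text = in.substr(from, close - from);
            return stop ? STOP : START;
        }
        std::size_t lt = in.find('<', pos);
        if (lt == std::string::npos) lt = in.size();
        text = in.substr(pos, lt - pos);
        pos = lt;
        while (space(text.back())) text.pop_back();  // first char is non-space, so never empties
        bool digits = true;
        for (char c : text)
            if (!std::isdigit(static_cast<unsigned char>(c))) digits = false;
        return digits ? NUM : STR;
    }
};

struct Parser {
    Lexer lex;
    Tok tok;

    explicit Parser(std::string s) : lex(std::move(s)) { tok = lex.next(); }

    // simple_xml : START value STOP
    std::unique_ptr<Tag> simple_xml() {
        if (tok != START) throw std::runtime_error("expected opening tag");
        std::unique_ptr<Tag> tag(new Tag);
        tag->name = lex.text;
        tok = lex.next();

        // value : simple_xml | STR | NUM | value simple_xml
        bool any = false;
        if (tok == STR || tok == NUM) {
            tag->value = lex.text;
            tok = lex.next();
            any = true;
        }
        while (tok == START) {
            tag->kids.push_back(simple_xml());
            any = true;
        }
        if (!any) throw std::runtime_error("empty tag");

        if (tok != STOP || lex.text != tag->name)
            throw std::runtime_error("mismatched closing tag");
        tok = lex.next();
        return tag;
    }
};

// Renders the tree as name=value or name(child,...), handy for checking the result.
std::string show(const Tag& t) {
    if (t.kids.empty()) return t.name + "=" + t.value;
    std::string s = t.name + "(";
    for (std::size_t i = 0; i < t.kids.size(); ++i) {
        if (i) s += ",";
        s += show(*t.kids[i]);
    }
    return s + ")";
}
```

Constructing a Parser over the example document from the question and calling simple_xml() returns the root Tag; passing it to show() gives div(mul(num=20,add(num=1,num=5)),id=test). In a bison version, the body of simple_xml() would become the semantic actions attached to the two rules.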

Source: https://habr.com/ru/post/1751834/

