Parsing and accessing untrusted XML

I have a kind of XML conversion gateway: it accepts XML in one format, from sources I probably should not trust, and emits it in another. The transformations range from trivial, such as changing a couple of attributes here and there, to quite complex, where I need to parse the whole input and build the output from scratch. So basically I have two problems:

  • XML parsing. It should be fast (preferably) and must not blow up the atom table (I'm looking at you, xmerl), since the sources are not that reliable.

  • Easy access to deeply nested elements to retrieve the information I need.

Although there are several options for parsing XML, such as the fast_xml and erlsom libraries, they produce structures that are pretty hard to access, because they are not compatible with xmerl_xpath, and so far that is the only reasonable way I've found to access deeply nested data.

So, the question is: is there a way to achieve both goals without spending a lot of time building my own solution?

P.S. Seriously? Voting to close this question? I am not asking which of the 100 available libraries to use; I am asking how to solve a problem that most people who choose Erlang to process XML are likely to run into.

1 answer


1) Erlsom and Fast XML turn XML into a plain Erlang structure like {"tag", [{"attr", "value"}], ["text node"]} […]

2) […] 50 LOC

3) […] XML […] xmerl […] cdata […] ?!
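To make point 1 concrete, here is a minimal sketch of a path accessor over the {"tag", [{"attr", "value"}], ["text node"]} shape. It is not taken from any library: the module name sf_path and all of its functions are hypothetical, and it assumes that tags, attribute names and text nodes are plain strings and that child elements sit in the third tuple position next to the text nodes. Nothing in it creates atoms from input data.

```erlang
-module(sf_path).
-export([find_all/2, find_first/2, attr/2, text/1, demo/0]).

%% Assumed element shape: {Tag, Attrs, Content}
%%   Tag     :: string()
%%   Attrs   :: [{string(), string()}]
%%   Content :: [Element | string()]   (child elements mixed with text)

%% Return every element reached by following Path from the root.
%% Path is a list of tag names and its first entry must match the root tag.
find_all([Tag], {Tag, _, _} = El) ->
    [El];
find_all([Tag | Rest], {Tag, _, Content}) ->
    lists:flatmap(fun(Child) -> find_all(Rest, Child) end, Content);
find_all(_, _) ->
    %% Tag mismatch, empty path, or a text node: nothing to return.
    [].

%% Return the first match only.
find_first(Path, El) ->
    case find_all(Path, El) of
        [First | _] -> {ok, First};
        []          -> error
    end.

%% Look up an attribute value by name.
attr(Name, {_, Attrs, _}) ->
    case lists:keyfind(Name, 1, Attrs) of
        {Name, Value} -> {ok, Value};
        false         -> error
    end.

%% Concatenate the direct text nodes of an element (child elements are tuples,
%% so is_list/1 keeps only the strings).
text({_, _, Content}) ->
    lists:append([C || C <- Content, is_list(C)]).

%% Small usage example on a hand-built document.
demo() ->
    Doc = {"order", [{"id", "42"}],
           [{"items", [],
             [{"item", [{"sku", "a1"}], ["Widget"]},
              {"item", [{"sku", "b2"}], ["Gadget"]}]}]},
    Items = find_all(["order", "items", "item"], Doc),
    [text(I) || I <- Items].
```

Calling sf_path:demo() walks order/items/item and collects the text of each item. If the parser is configured to return binaries instead of strings, text/1 would need to filter with is_binary/1 instead.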


Source: https://habr.com/ru/post/1683967/

