Will rewriting a multi-purpose file parser to use formal grammars improve maintainability?

TL;DR: I hand-built a multi-purpose parser with separate code for each format. Would it ultimately work better as a single piece of parser code that uses ANTLR, PyParsing, or a similar tool, with a grammar describing each format?

Context: My work involves many test log files from about 50 different tests. Several are XML, several are HTML, several are CSV, and many are in proprietary formats with no documented specification. To save my colleagues and me the time of entering this data manually, I wrote a parsing tool that handles all the formats we deal with regularly behind a uniform interface. The design, however, is not very clean.

I wrote this thing in Python and created a Parser class. Each file format is handled by an implementation that provides its own code for the Parser read() method. I like the idea of having only one Parser definition that uses a grammar to understand each format, but I have never done this before.
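For readers who want a concrete picture of the design described above, here is a minimal sketch. It assumes each implementation is a subclass that overrides read(); the subclass names, the record layout, the <result> element, and the PARSERS registry are all hypothetical and not taken from the actual tool.

```python
import csv
import io
import xml.etree.ElementTree as ET


class Parser:
    """Uniform interface: every format returns a list of dict records."""

    def read(self, text):
        raise NotImplementedError


class CsvParser(Parser):
    def read(self, text):
        # csv.DictReader uses the first row as the field names.
        return [dict(row) for row in csv.DictReader(io.StringIO(text))]


class XmlParser(Parser):
    def read(self, text):
        # Assumption: each <result> element carries its fields as attributes.
        root = ET.fromstring(text)
        return [dict(el.attrib) for el in root.iter("result")]


# Hypothetical registry mapping a format name to its implementation.
PARSERS = {"csv": CsvParser, "xml": XmlParser}


def parse(fmt, text):
    """Pick the implementation for fmt and return its records."""
    return PARSERS[fmt]().read(text)
```

With this layout, supporting a new format means writing one more subclass and registering it. A grammar-driven design would instead replace each subclass body with a grammar definition.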

Is it worth my time, and will the tool be easier for newcomers to work on in the future once I finish refactoring?

1 answer

I cannot answer your question with 100% certainty, but I can give you an opinion.

I find that choosing between a proper grammar and a hand-written regex parser often comes down to how uniform the input is.

If the input is very uniform and you already know a language that is good at string handling, like Python or Perl, I would keep the existing code.
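As an illustration of the uniform case, here is a rough sketch of a hand-written regex reader. The line layout (timestamp, test name, PASS or FAIL) is an assumption made up for this example, as is the function name read_uniform_log.

```python
import re

# Hypothetical line layout: "<timestamp> <test name> <PASS|FAIL>"
LINE = re.compile(r"^(?P<time>\S+)\s+(?P<test>\S+)\s+(?P<status>PASS|FAIL)$")


def read_uniform_log(text):
    """Return one dict per line that fits the assumed layout."""
    records = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:  # lines that do not fit the layout are skipped
            records.append(match.groupdict())
    return records
```

When every line follows one pattern like this, a single regex stays readable, and a grammar adds little.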

, , , Antlr, , . , , , , .

, , , Antlr vs regexs. , , , Antlr , .

, , . , - , , .

Source: https://habr.com/ru/post/1731251/

