What is a good tool for automatically calculating FIRST and FOLLOW sets?

I'm currently in the middle of a game with BNF grammar, which I hope will be able to argue in LL (1) form. However, I just finished making changes and calculating the new FIRST and FOLLOW grammar sets manually for the third time today, and I get tired of it. There must be a better way!

Can someone suggest a tool that, given the grammar, will automatically calculate the first and subsequent sets for all non-terminals?

+4
source share
3 answers

A year ago, we had a semester project at the university, in which I participated, where our task was to create a programming language. As a group, we decided that we wanted to be able to manually write the parser from scratch, so we had to strive for LL (1) grammar, since it would be completely unrealistic to write a parser otherwise.

Of course, our starting point was far from LL (1), so we also had to argue. For this purpose, we used the kfgEdit tool from the AtoCC package. All you do is enter your own rules and then check if this is LL (1) grammar with the click of a button.

A bright word of warning: the instrument is a little stunned by what it takes. Although you often use EBNF for real grammar, so you can write? and * and +, to signal how many times this token should appear there, this is not supported. Grouping is also not supported. You can very well understand that this takes a very long time, and you will almost certainly want to do some “rearrangement” after you reach LL (1) to make the grammar even close to readable.

Of course, depending on the type of grammar you are dealing with, this may not be a big problem for you. We created a kind of Pascal / C hybrid with a rather limited set of constructs (procedures, functions, only built-in primitive types and arrays of them, ifs, a single loop construct, which we came up with instead of standard 3 ...), and it took me at least a week to impose it in LL (1) grammar - probably 2, actually. Please note that this is about 4 months, so a lot of time has been spent.

If you absolutely SHOULD have LL (1) grammar, you obviously have to click if you get into this situation, but if you are allowed to use parser generators like yacc / bison or SableCC, then you will end up most likely it will be much easier to go down this route. This does not mean that you SHOULD go this way - I found that in fact everything written by hand gave some information that I probably would not have received otherwise - but you might be better off getting this information in a different situation than your current.

tl; dr version: Use kfgEdit from the AtoCC package.

+3
source

For a recursive descent parsing, it's worth a look at ANTLR . However, I'm not sure if it gives an exact answer to your question - find FIRST and FOLLOW for a given grammar.

0
source

DMS Software Reengineering Toolkit contains a parser generator that calculates FIRST and FOLLOW; it will also allow you to inspect the state machine L (AL) R that it generates.

However, if you have a legitimate context-free grammar, you do not need to “bicker it” into LL form; DMS parser generator produces GLR parsers from any context-free grammar.

0
source

Source: https://habr.com/ru/post/1301229/


All Articles