We saw a clean standard library approach .
These are execute d instructions immediately.
Create a parser that creates an AST (abstract syntax tree). In the case of our simple stacking machine, this is just a list of instructions. Let's call him Tape .
Using Boost Spirit
I would advise against lexer. Lexers are supported in Spirit v2 (and not in X3 - yet?). But in practice, they complicate the situation, and the Spirit knows how to cancel the entry in case of non-compliance. Thus, you can just tentatively approach the productions and try the following, if this is not the correct "token".
Here, where using the grammar of the Spirit should look like this:
Tape program; boost::spirit::istream_iterator f(std::cin >> std::noskipws), l; // Note the noskipws! if (parse(f, l, Parser::program, program)) { std::cout << "Parsed " << program.size() << " instructions\n"; } else { std::cout << "Parse failed\n"; }
Now AST types:
struct Add {}; struct Sub {}; struct Mul {}; struct Div {}; struct Pop {}; using Value = int; using Instr = boost::variant<Add, Sub, Mul, Div, Pop, Value>; using Tape = std::vector<Instr>;
Simple, correct.
Grammar
In X3, making grammar pretty easy. Top down:
auto instr = opcode_ | int_; auto program = skip(space) [*instr];
Now all we need to do is teach him to recognize the operation codes. The beginning would be:
struct opcodes : symbols<Instr> { opcodes() { this->add("add", Add{})("sub", Sub{})("mul", Mul{})("div", Div{})("pop", Pop{}); } } opcode_;
An experienced spirit guru will find a problem here: opcode_ not a token, and it does not guarantee parsing of a "different identifier". For instance. "a dd" will match Add . And "additional" will also match.
Fortunately, the X3 makes it easy to make directives on the fly:
auto opcode_ = [] { struct opcodes : symbols<Instr> { opcodes() { this->add("add", Add{})("sub", Sub{})("mul", Mul{})("div", Div{})("pop", Pop{}); } } codes_; return lexeme[codes_ >> !graph]; }();
So now both holes are fixed.
Full demo
Live on coliru
#include <iostream> #include <deque> #include <boost/spirit/home/x3.hpp> #include <boost/spirit/include/support_istream_iterator.hpp> struct Add {}; struct Sub {}; struct Mul {}; struct Div {}; struct Pop {}; using Value = int; using Instr = boost::variant<Add, Sub, Mul, Div, Pop, Value>; struct Machine { using result_type = void; std::deque<Value> stack_; void operator()(Instr instr) { boost::apply_visitor(*this, instr); } void operator()(Add) { assert(stack_.size()>=2); auto op2 = stack_.back(); stack_.pop_back(); auto op1 = stack_.back(); stack_.pop_back(); stack_.push_back(op1 + op2); } void operator()(Sub) { assert(stack_.size()>=2); auto op2 = stack_.back(); stack_.pop_back(); auto op1 = stack_.back(); stack_.pop_back(); stack_.push_back(op1 - op2); } void operator()(Mul) { assert(stack_.size()>=2); auto op2 = stack_.back(); stack_.pop_back(); auto op1 = stack_.back(); stack_.pop_back(); stack_.push_back(op1 * op2); } void operator()(Div) { assert(stack_.size()>=2); auto op2 = stack_.back(); stack_.pop_back(); auto op1 = stack_.back(); stack_.pop_back(); assert(op2 != 0); stack_.push_back(op1 / op2); } void operator()(Value v) { stack_.push_back(v); } void operator()(Pop) { assert(stack_.size()>=1); stack_.pop_back(); } void trace() const { using namespace std; // debug trace copy(stack_.begin(), stack_.end(), ostream_iterator<Value>(cout, " ")); cout << "\n"; } }; using Tape = std::vector<Instr>; namespace Parser { using namespace boost::spirit::x3; auto opcode_ = [] { struct opcodes : symbols<Instr> { opcodes() { this->add("add", Add{})("sub", Sub{})("mul", Mul{})("div", Div{})("pop", Pop{}); } } codes_; return lexeme[codes_ >> !graph]; }(); auto instr = opcode_ | int_; // TODO auto program = skip(space) [*instr]; } int main() { Tape program; boost::spirit::istream_iterator f(std::cin >> std::noskipws), l; // Note the noskipws! if (parse(f, l, Parser::program, program)) { std::cout << "Parsed " << program.size() << " instructions\n"; } else { std::cout << "Parse failed\n"; } if (f != l) std::cout << "Unparsed program data: '" << std::string(f,l) << "'\n"; Machine machine; for (auto instr : program) { machine(instr); machine.trace(); } }
Print
Parsed 7 instructions 1 1 2 3 3 3 3 3 1 3 2 6
Summary
The main gain here:
- we declaratively define our grammar. For a more complex grammar, this can be a huge advantage.
we get a free retreat - so no need to split tokens ahead of time
Note. This is a PEG grammar. Rollback only comes down to trying to use the next alternative of the sibling or a failure in the current rule (therefore, the parent rule can try the next alternative of the sister).
This is significantly different from backtracking, as in regular expression. You will notice a difference with other expressions of the Kleene- * en parser. In PEG grammars, they are always greedy and never return only one element (similar to the "Maximum Munch" rules).
We no longer have a messy switch . It is actually hidden inside the Variant Visitor (see apply_visitor ).
The execution of the instruction is practically unmodified, but we renamed execute to operator() so that it simulates the concept of Visitor.