I wrote a rather complicated parser for a stack-based language. It loads a file into memory and then walks through it, comparing each token to decide whether it is an operand or an instruction.
Every time I have to parse a new operand or instruction, I std::copy the bytes from the file buffer into a std::string and then do:

    if (parsed_string.compare("add") == 0) {
    } else if (parsed_string.compare("sub") == 0) {
    } else {
    }

Unfortunately, all of these copies slow down parsing.
How should I handle this to avoid all of these copies? I always thought that I did not need a tokenizer, since the language itself and the logic are pretty simple.
Edit: here is the code that makes the copies for the various operands and instructions:

    // This function accounts for 70% of the total time of the program
    std::string Parser::read_as_string(size_t start, size_t end)
    {
        std::vector<char> file_memory(end - start);
        read_range(start, end - start, file_memory);
        std::string result(file_memory.data(), file_memory.size());
        return std::move(result); // Intended to be consumed
    }

    // Takes a std::vector<char>& so that it matches the call above
    void Parser::read_range(size_t start, size_t size, std::vector<char>& destination)
    {
        if (destination.size() < size)
            destination.resize(size); // Allocate necessary space
        std::copy(file_in_memory.begin() + start,
                  file_in_memory.begin() + start + size,
                  destination.begin());
    }
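
For comparison, a copy-free variant of read_as_string could hand out non-owning views into the file buffer instead of building a new std::string per token. The following is only a minimal sketch under stated assumptions: it assumes C++17 (for std::string_view), that file_in_memory is a std::string holding the whole file, and that the buffer outlives every view taken from it. The names ViewParser, read_as_view, Op and classify are hypothetical and do not come from the code above.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for the Parser above; only the file buffer is modelled.
class ViewParser {
public:
    explicit ViewParser(std::string contents)
        : file_in_memory(std::move(contents)) {}

    // Returns a non-owning view of the range [start, end).
    // No allocation and no copy of the characters takes place.
    // Assumes start <= end; substr throws std::out_of_range if start is
    // past the end of the buffer and clamps the length otherwise.
    std::string_view read_as_view(std::size_t start, std::size_t end) const {
        return std::string_view(file_in_memory).substr(start, end - start);
    }

private:
    std::string file_in_memory; // assumed to hold the entire file
};

enum class Op { Add, Sub, Unknown };

// Comparing a std::string_view against a string literal does not allocate.
inline Op classify(std::string_view token) {
    if (token == "add") return Op::Add;
    if (token == "sub") return Op::Sub;
    return Op::Unknown;
}
```

With a buffer containing "add 1 sub 2", classify(parser.read_as_view(0, 3)) would yield Op::Add without touching the heap. The trade-off is lifetime: a view dangles as soon as the buffer is destroyed or reallocated, so any token that has to be stored past that point still needs to be copied into a std::string.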