Out of curiosity, I am writing a text parser for a programming language. Let's say I want to define an immutable (at run time) graph of tokens as vertices/nodes. Of course, the tokens are of various types: some are keywords, some are identifiers, and so on. However, they all share a common feature: each token in the graph points to another. This property lets the parser find out what can follow a specific token, so the graph effectively defines the formal grammar of the language. My problem is that I stopped using C++ on a daily basis several years ago, and since then I have worked in many higher-level languages, so my grasp of heap allocation, stack allocation, and the like is completely fragmented. Alas, my C++ is rusty.
However, I would like to climb the steep hill right away and set myself the goal of defining this graph in C++ in the best possible way. For example, I want to avoid allocating each token object separately on the heap with 'new', because I suspect that if I allocated the entire graph of tokens contiguously (linearly, like elements of an array), it could benefit performance through locality of reference: when the whole graph is packed into the minimum space along one "line" in memory, rather than having its token objects scattered at random addresses, that should be a plus, right? In any case, as you can see, this is a very open question. Here is my rough attempt so far (and see the contiguous-layout sketch right after it):
class token
{
public:
    virtual ~token() = default; // polymorphic base, so it needs a virtual destructor
};

class word : public token
{
    const char* chars;
public:
    word(const char* s) : chars(s)
    {
    }
};

class ident : public token
{
};

template<int N> class composite_token : public token
{
    token* tokens[N]; // pointers: holding token by value would slice the derived types
};

class graph
{
    token* p_root_token;
};
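To make the contiguous-layout idea concrete, here is a minimal sketch of one way it could look, assuming C++17 (for std::variant) and reusing the same names as above. It replaces the inheritance hierarchy with a variant and links tokens by index instead of by pointer, so it is only an illustration of the memory layout I have in mind, not a finished design:

#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Sketch only: each concrete token kind is a plain struct, and links are
// indices into one contiguous vector rather than raw pointers.
struct word  { std::string chars; std::size_t next; };
struct ident { std::string name;  std::size_t next; };

using token = std::variant<word, ident>;

struct graph
{
    std::vector<token> nodes;   // the whole graph in one contiguous block
    std::size_t        root = 0;
};

int main()
{
    graph g;
    g.nodes.push_back(word{"if", 1});   // node 0 points to node 1
    g.nodes.push_back(ident{"x", 0});   // node 1 points back to node 0
}

Index links also survive a vector reallocation, which raw pointers into the vector would not, and since the graph is immutable at run time I could reserve the exact size up front. Is this the kind of layout that actually delivers the locality win, or would a custom arena allocator be the more idiomatic route?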
So: what is the right way to approach this? Should the tokens live in one array, be linked through pointers, or something else entirely? I'd like to write proper C++ here, not C.