As the question has received some upvotes (although it is somewhat of a duplicate), I will answer my original questions here (thanks to the comments above):
- Yes, Python checks the contents of an internal table, but only for some strings, mainly those that could also be used as identifiers. The idea is that the speedup trick the Python interpreter (compiler?) uses to handle identifiers is also useful for general string processing. This process is called interning.
- As far as I know, there is no restriction on the size of the string, but there are other rules for a string to be reused (basically: it has to look like a Python identifier).
- Yes, the table is simply a Python dict, and the strings have a hash that is used for the lookup.
- It is used only for string literals and constant expressions. Basically, for all the things that the Python interpreter can deduce at compile time.
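The table lookup can also be triggered by hand with `sys.intern`, which is a quick way to see the effect. This is only a minimal sketch; the helper `build` and the sample strings are made up for illustration, and whether the two runtime-built strings start out as separate objects is an implementation detail of the interpreter.

```python
import sys

def build(parts):
    # Build the string at runtime so the compiler cannot precompute it.
    return "".join(parts)

a = build(["hello", "_", "world"])
b = build(["hello", "_", "world"])

print(a == b)  # equal by value
print(a is b)  # identity of runtime-built strings is implementation-dependent

# sys.intern looks the string up in the interpreter's intern table and
# returns the one canonical object for that value.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # both names now refer to the same object
```

Once two strings are interned, comparing them can be done by identity instead of character by character, which is the speedup mentioned above.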
To clarify the last point, the following snippets all evaluate to the string 'xxx', but they are treated differently with respect to interning.
This is a constant expression:
'x' * 3
But this is not:
a = 'x'
a * 3 # this is not a constant expression, so no interning can be applied.
And this is not a constant expression either:
''.join(['x', 'x', 'x'])
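One way to check what the compiler could work out ahead of time is to look at the constants stored in the compiled code object. This is a CPython-oriented sketch, not a guarantee: constant folding is an optimization detail that can differ between versions and implementations, and the names `folded` and `unfolded` are just for illustration.

```python
# Compile both expressions without running them.
folded = compile("'x' * 3", "<demo>", "eval")
unfolded = compile("a * 3", "<demo>", "eval")

# On CPython the first tuple is expected to contain the precomputed 'xxx'.
print(folded.co_consts)

# The value of `a` is unknown at compile time, so 'xxx' cannot be stored here.
print(unfolded.co_consts)

# The multiplication only happens when the code is actually evaluated.
result = eval(unfolded, {"a": "x"})
print(result)
```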