When the CPython REPL executes a line of input, it will:
- parse and compile it into a bytecode code object, and then
- execute that bytecode.
The result of the compilation can be inspected with the dis module:
    >>> import dis
    >>> dis.dis('a = [1234, 1234, 5678, 90123, 5678, 4321]')
      1           0 LOAD_CONST               0 (1234)
                  2 LOAD_CONST               0 (1234)
                  4 LOAD_CONST               1 (5678)
                  6 LOAD_CONST               2 (90123)
                  8 LOAD_CONST               1 (5678)
                 10 LOAD_CONST               3 (4321)
                 12 BUILD_LIST               6
                 14 STORE_NAME               0 (a)
                 16 LOAD_CONST               4 (None)
                 18 RETURN_VALUE
Note that every 1234 is loaded with "LOAD_CONST 0", and every 5678 is loaded with "LOAD_CONST 1". The numbers 0 and 1 are indices into the constant table associated with the code object. Here, the table is (1234, 5678, 90123, 4321, None).
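The constant table can also be inspected directly through the code object's co_consts attribute. This is a minimal sketch, assuming CPython; the filename "<demo>" is only a placeholder. The exact contents of co_consts depend on the CPython version (newer versions may fold the list elements into a single tuple constant), so no output is shown:

```python
# Compile the same line the way the REPL would, without running it.
source = "a = [1234, 1234, 5678, 90123, 5678, 4321]"
code = compile(source, "<demo>", "exec")  # "<demo>" is a placeholder filename

# co_consts is the constant table that LOAD_CONST indexes into.
print(code.co_consts)
```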
The compiler knows that all copies of 1234 within one code object are the same constant, so it creates a single object and reuses it for all of them.
Therefore, as OP observed, a[0] and a[1] really refer to the same object: the same constant from the table of constants of the code object of this line of code.
When you then execute b = 1234, that line is compiled and executed separately, independent of the previous line, so a different object is created for its 1234.
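The REPL behaviour can be imitated in a script by compiling each line separately. This is a rough sketch, assuming CPython; the namespace dict ns and the "<line 1>" / "<line 2>" filenames are made up for the illustration. The identity results are implementation details, not language guarantees:

```python
# Each compile() call stands in for the REPL compiling one input line on its own.
ns = {}  # shared namespace, playing the role of the REPL's globals
exec(compile("a = [1234, 1234]", "<line 1>", "exec"), ns)
exec(compile("b = 1234", "<line 2>", "exec"), ns)

# Both list elements come from the same code object's constant table.
same_line = ns["a"][0] is ns["a"][1]
# b comes from a different code object, compiled independently.
across_lines = ns["a"][0] is ns["b"]

print(same_line)
print(across_lines)
```

On CPython, same_line should be True, since both elements are the one constant of the first code object. across_lines is expected to be False for the reason described above, but the language does not promise either result.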
(You can read http://akaptur.com/blog/categories/python-internals/ for a brief introduction to how code objects are interpreted.)
Outside of the REPL, when you run a *.py file, the module-level code is compiled into one code object and each function is compiled into its own separate code object. So when you run this:

    a = [1234, 1234]
    b = 1234
    print(id(a[0]), id(a[1]))
    print(id(b))

    a = (lambda: [1234, 1234])()
    b = (lambda: 1234)()
    print(id(a[0]), id(a[1]))
    print(id(b))
We can see something like:

    4415536880 4415536880
    4415536880
    4415536912 4415536912
    4415537104
- The first three numbers are the same address, 4415536880. That object belongs to the constants of the module-level code object ("__main__").
- Then a[0] and a[1] share the address 4415536912, which belongs to the constants of the first lambda, and b has the address 4415537104, which belongs to the constants of the second lambda.
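The separate code objects behind those addresses can be listed by compiling the second half of the script and looking for nested code objects in the module's constant table. This is only a sketch, assuming CPython; the names module_code and nested, and the "<demo>" filename, are illustrative:

```python
import types

# The two lambda lines from the script above, compiled as one module.
src = "a = (lambda: [1234, 1234])()\nb = (lambda: 1234)()\n"
module_code = compile(src, "<demo>", "exec")

# Each lambda's code object is stored as a constant of the module code object.
nested = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]

for c in nested:
    # Every lambda carries its own constant table, so each has its own 1234.
    print(c.co_name, c.co_consts)
```

Two code objects named <lambda> are found, each with its own co_consts, which is why the constants of the two lambdas live at different addresses.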
Also note that this result is specific to CPython. Other implementations use different strategies for handling constants. For example, running the code above in PyPy gives:

    19745 19745
    19745
    19745 19745
    19745