This is because tuple objects (and, I'm fairly sure, all containers other than strings) compute their size not by including the actual sizes of their contents, but by multiplying the size of a pointer to a PyObject by the number of elements they contain. That is, they hold pointers to the (generic) PyObjects they contain, and those pointers are what contribute to the overall size.
This is hinted at in the Data Model chapter of the Python Reference:
Some objects contain references to other objects; these are called containers. Examples of containers are tuples, lists and dictionaries. The references are part of a container's value.
(I'm emphasizing the word references.)
In PyTuple_Type, the struct that holds the information about the tuple type, we see that the tp_itemsize field has sizeof(PyObject *) as its value:
PyTypeObject PyTuple_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "tuple",
    sizeof(PyTupleObject) - sizeof(PyObject *),  /* tp_basicsize */
    sizeof(PyObject *),                          /* tp_itemsize: one pointer per element */
On 32-bit and 64-bit builds of Python, sizeof(PyObject *) is 4 and 8 bytes, respectively.
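The pointer size of a given interpreter can be checked from Python itself. A minimal sketch, assuming CPython, where the tp_itemsize slot is visible as tuple.__itemsize__ (the variable names are mine):

```python
import struct

# Size in bytes of a C pointer on this build: 4 on 32-bit, 8 on 64-bit.
pointer_size = struct.calcsize("P")

# CPython exposes the type's tp_itemsize slot as the __itemsize__ attribute.
tuple_item_size = tuple.__itemsize__

print(pointer_size, tuple_item_size)
```

Both numbers should agree, since each tuple slot is a single PyObject pointer.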
This is the value that gets multiplied by the number of elements contained in the tuple instance. When we look at object_sizeof, the __sizeof__ method that tuple inherits from object (check object.__sizeof__ is tuple.__sizeof__), we can see this clearly:
static PyObject *
object_sizeof(PyObject *self, PyObject *args)
{
    Py_ssize_t res, isize;

    res = 0;
    isize = self->ob_type->tp_itemsize;
    if (isize > 0)
        res = Py_SIZE(self) * isize;  /* number of elements * tp_itemsize */
See how isize (obtained from tp_itemsize) is multiplied by Py_SIZE(self), a macro that reads the ob_size value, which indicates the number of elements inside the tuple.
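The same arithmetic can be reproduced from Python. This is only a sketch, assuming CPython, where tp_basicsize and tp_itemsize are exposed as __basicsize__ and __itemsize__, and relying on the fact that object_sizeof goes on to add tp_basicsize to the product. The helper name expected_tuple_sizeof is hypothetical:

```python
def expected_tuple_sizeof(t):
    """Mirror object_sizeof: number of items * item size + basic size."""
    return len(t) * tuple.__itemsize__ + tuple.__basicsize__


samples = [(), (1,), (1, 2, 3), tuple(range(100))]
for sample in samples:
    # Print the length, the interpreter's own figure, and the recomputed one.
    print(len(sample), sample.__sizeof__(), expected_tuple_sizeof(sample))
```

Note that nothing in the helper looks at what the elements actually are, only at how many there are.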
That's why, even if we put a fairly large string inside a tuple instance:
t = ("Hello" * 2 ** 10,)
with the element inside it having a size of:
t[0].__sizeof__() # 5169
the size of the tuple instance:
t.__sizeof__()
equals that of a tuple with just "Hello" inside:
t2 = ("Hello",)
t2[0].__sizeof__() # 54
t2.__sizeof__() # same value as t.__sizeof__()
For strings, each individual character increases the value returned from str.__sizeof__. This, along with the fact that tuples store only pointers, gives the misleading impression that "Hello" is larger than the tuple containing it.
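Both effects can be seen side by side. The following is a sketch only; the exact byte counts vary with the Python version and platform, so it compares sizes instead of hard-coding them (the variable names are illustrative):

```python
small = "Hello"
large = "Hello" * 2 ** 10

small_tuple = (small,)
large_tuple = (large,)

# The strings differ in size by one byte per extra ASCII character...
print(large.__sizeof__() - small.__sizeof__(), len(large) - len(small))

# ...but each tuple holds a single pointer, so the tuples are the same size.
print(small_tuple.__sizeof__() == large_tuple.__sizeof__())
```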
Just for completeness, unicode__sizeof__ is the function that computes this. It really just multiplies the length of the string by the character size (which depends on the kind of string: 1, 2 or 4 bytes per character).
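The per-character cost can be observed by growing a string by one character and looking at the difference. A rough sketch, assuming CPython 3.3 or later, where strings use the compact 1, 2 or 4 byte representation; bytes_per_char is a hypothetical helper, not part of any library:

```python
def bytes_per_char(ch, n=10):
    """Growth in __sizeof__ when one more copy of ch is appended."""
    return (ch * (n + 1)).__sizeof__() - (ch * n).__sizeof__()


# An ASCII letter, a Greek capital delta, and an emoji outside the BMP.
for ch in ["a", "\u0394", "\U0001F600"]:
    print(repr(ch), bytes_per_char(ch))
```

Under those assumptions the three characters should cost 1, 2 and 4 bytes each.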
The only thing I don't get with tuples is why the basic size (indicated by tp_basicsize) is specified as sizeof(PyTupleObject) - sizeof(PyObject *). This sheds 8 bytes from the total size returned; I have not found an explanation for this (yet).