Calling __new__ when subclassing tuple

In Python, when tuple is subclassed, the __new__ method is called with self as an argument. For example, here's a paraphrased version of the PySpark Row class:

    class Row(tuple):
        def __new__(self, args):
            return tuple.__new__(self, args)

But help(tuple) does not show the self argument for __new__ :

    __new__(*args, **kwargs) from builtins.type
        Create and return a new object.  See help(type) for accurate signature.

and help(type) just says the same thing:

    __new__(*args, **kwargs)
        Create and return a new object.  See help(type) for accurate signature.

So, how is self passed to __new__ in the definition of the Row class?

  • Is it through *args ?
  • Does __new__ have any subtlety where its signature can change with context?
  • Or is it a mistake in the documentation?

Can I view the source of tuple.__new__ so that I can see the answer for myself?

My question is not a duplicate of this one, because all the discussion there concerns __new__ methods that explicitly have self or cls as the first argument. I am trying to understand:

  • Why tuple.__new__ does not have self or cls as the first argument.
  • How I can examine the source code of the tuple class to see what really happens.
1 answer

The correct signature of tuple.__new__

Functions and types implemented in C often cannot be introspected, and in that case their signature always looks like this.

The correct signature of tuple.__new__ is:

 __new__(cls[, sequence]) 

For instance:

    >>> tuple.__new__(tuple)
    ()
    >>> tuple.__new__(tuple, [1, 2, 3])
    (1, 2, 3)

Unsurprisingly, this is just like calling tuple(), except that you have to write tuple twice.


The first argument of __new__

Note that the first argument to __new__ is always the class, not the instance. In fact, the role of __new__ is to create and return the new instance.

The special __new__ method is a static method.

I say this because in your Row.__new__ I see self: while the argument name does not matter (except when using keyword arguments), beware that self will be Row or a subclass of Row, not an instance. The general convention is to name the first argument cls instead of self.
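To make this concrete, here is a minimal sketch (the `seen` list and the `SubRow` subclass are hypothetical, added only for illustration) that records what `__new__` actually receives as its first argument:

```python
seen = []  # hypothetical helper: records the first argument of each __new__ call


class Row(tuple):
    def __new__(cls, args):
        # cls is the class being instantiated, not an instance
        seen.append(cls)
        return tuple.__new__(cls, args)


class SubRow(Row):  # hypothetical subclass, only for illustration
    pass


row = Row([1, 2])
sub = SubRow([3, 4])

print(seen == [Row, SubRow])  # the classes themselves were passed in
print(type(row) is Row, type(sub) is SubRow)
```

When SubRow is instantiated, the inherited `__new__` receives SubRow, which is why passing `cls` on to `tuple.__new__` produces an instance of the right subclass.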


Back to your questions

So, how is self passed to __new__ in the definition of the Row class?

When you call Row(...) , Python automatically calls Row.__new__(Row, ...) .
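As a rough illustration of that statement, the sketch below builds the same object both ways. It leaves out the `__init__` call that Python also makes after `__new__`, which does nothing extra for this class:

```python
class Row(tuple):
    def __new__(cls, args):
        return tuple.__new__(cls, args)


# The usual way: Python looks up Row.__new__ and passes Row as first argument
via_call = Row([1, 2, 3])

# The same thing spelled out by hand
via_new = Row.__new__(Row, [1, 2, 3])

print(via_call == via_new)
print(type(via_call) is type(via_new) is Row)
```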

  • Is it through *args ?

You can write Row.__new__ as follows:

    class Row(tuple):
        def __new__(*args, **kwargs):
            return tuple.__new__(*args, **kwargs)

It works, and there is nothing wrong with that. This is very useful if you do not care about the arguments.

  • Does __new__ have any subtlety where its signature can change with context?

No, the only special thing about __new__ is that it is a static method.
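You can check the static-method part for yourself. The following is a small sketch that assumes CPython, where a `__new__` defined in a class body is stored in the class dictionary wrapped in `staticmethod` even though you never wrote the decorator:

```python
class Row(tuple):
    def __new__(cls, args):
        return tuple.__new__(cls, args)


# Look at the raw entry in the class dictionary, bypassing attribute lookup
raw = Row.__dict__["__new__"]
is_static = isinstance(raw, staticmethod)

# Accessing it through the class gives back the plain function, with
# nothing bound, so the class has to be passed explicitly
same_function = Row.__new__ is raw.__func__

print(is_static, same_function)
```

This is why `Row.__new__(Row, ...)` needs Row written out as the first argument: no implicit binding takes place.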

  • Or is it a mistake in the documentation?

I would say that it is incomplete or ambiguous.

  • Why tuple.__new__ does not have self or cls as the first argument.

It does have one; it just does not show up in help(tuple.__new__), because functions and methods implemented in C often do not expose that information.

  • How I can examine the source code of the tuple class to see what really happens.

The file you are looking for is Objects/tupleobject.c . In particular, you are interested in the tuple_new() function:

    static char *kwlist[] = {"sequence", 0};
    /* ... */
    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|O:tuple", kwlist, &arg))

Here "|O:tuple" means: the function is called "tuple" and takes one optional argument (| marks the start of the optional arguments, O stands for a Python object). The optional argument can also be passed using the keyword sequence.


About help(type)

For reference, you were looking at the documentation of type.__new__, whereas you should have stopped at the first four lines of help(type):

In the case of __new__(), the correct signature is the signature of type():

    class type(object)
     |  type(object_or_name, bases, dict)
     |  type(object) -> the object's type
     |  type(name, bases, dict) -> a new type

But this does not matter, since tuple.__new__ has a different signature.


Remember super() !

Last but not least, try using super() instead of directly calling tuple.__new__() .
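Here is a minimal sketch of what that could look like for the paraphrased Row class from the question. Since `__new__` is a static method, `cls` still has to be passed explicitly even when going through `super()`:

```python
class Row(tuple):
    def __new__(cls, args):
        # super() finds tuple.__new__, but does not bind cls for us
        return super().__new__(cls, args)


row = Row([1, 2, 3])
print(row)
print(type(row) is Row)
```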


Source: https://habr.com/ru/post/1239211/

