Independent Function or Method

Question

I need to operate on two objects of a class in a way that returns a third object of the same class, and I am trying to determine whether it is better to do this as an independent function that receives two objects and returns a third, or as a method that takes one other object and returns a third.

As a simple example, should it be this:

    from collections import namedtuple

    class Point(namedtuple('Point', 'x y')):
        __slots__ = ()

        # Attached to class
        def midpoint(self, otherpoint):
            mx = (self.x + otherpoint.x) / 2.0
            my = (self.y + otherpoint.y) / 2.0
            return Point(mx, my)

    a = Point(1.0, 2.0)
    b = Point(2.0, 3.0)

    print(a.midpoint(b))
    # Point(x=1.5, y=2.5)

Or this:

    from collections import namedtuple

    class Point(namedtuple('Point', 'x y')):
        __slots__ = ()

    # Not attached to class
    # Takes two point objects
    def midpoint(p1, p2):
        mx = (p1.x + p2.x) / 2.0
        my = (p1.y + p2.y) / 2.0
        return Point(mx, my)

    a = Point(1.0, 2.0)
    b = Point(2.0, 3.0)

    print(midpoint(a, b))
    # Point(x=1.5, y=2.5)

And why would one be preferable to the other?

This seems far less clear-cut than I expected when I asked the question.

In summary, something like a.midpoint(b) seems not to be favored, because it gives a special place to one point or the other in what is really a symmetric function that returns a completely new Point instance. Beyond that, it seems to be largely a matter of taste and style whether to use a free-standing module function or a function attached to the class but not meant to be called on an instance, such as Point.midpoint(a, b).

Personally, I lean stylistically towards free-standing module functions, but that may depend on the circumstances. In cases where a function is tightly bound to a class and there is a risk of namespace pollution or potential confusion, attaching the function to the class probably makes more sense.

In addition, several people suggested making the function more general, perhaps by implementing additional features on the class to support this. In this particular case of points and midpoints, that is probably the best overall approach: it supports polymorphism and code reuse, and it is highly readable. In many cases, however, that would not work (the project that inspired me to ask this, for example), but points and midpoints seemed like a concise and understandable example to illustrate the question.

Thanks to everyone, it was instructive.

+6

python api readability

TimothyAWiseman Oct 25 '11 at 23:13

6 answers

The first approach is reasonable and is not conceptually different from what set.union and set.intersection do. Any func(Point, Point) --> Point is clearly related to the Point class, so there is no question of interfering with the unity or cohesion of the class.

It would be a tougher choice if different classes were involved: draw_perpendicular(line, point) --> line. To choose between the classes, you would pick the one that has the most related logic. For example, str.join needs a string delimiter and a list of strings. It could have been a standalone function (as it was in the old days with the string module), a method on lists (but it only works for lists of strings), or a method on strings. The last was chosen because joining is more about strings than about lists. This choice was made even though it led to the arguably awkward expression delimiter.join(things_to_join).

I disagree with the other respondent who recommended using a classmethod. Those are often used for alternative constructor signatures, but not for transformations on instances of the class. For example, datetime.fromordinal is a classmethod for constructing a date from something other than an instance of the class (in this case, from an int). This contrasts with datetime.replace, which is a regular method for making a new datetime instance based on an existing instance. This should steer you away from using a classmethod for the midpoint computation.

Another thought: if you keep midpoint() with the Point() class, it becomes possible to create other classes that have the same Point API but a different internal representation (for example, polar coordinates may be more convenient than Cartesian coordinates for some types of work). If midpoint() is a separate function, you begin to lose the benefits of encapsulation and a consistent interface.
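The contrast between the two datetime calls can be seen in a short sketch. The specific dates below are arbitrary values chosen only for illustration.

```python
from datetime import datetime

# Alternative constructor (classmethod): builds an instance from an int,
# not from an existing datetime. Ordinal 1 is January 1 of year 1.
first_day = datetime.fromordinal(1)

# Regular method: derives a new instance from an existing one.
original = datetime(2011, 10, 25, 23, 19)
changed = original.replace(year=2012)

print(first_day)
print(changed)
print(original)  # the original instance is left unchanged
```

A midpoint computation takes existing instances as input, so it resembles replace far more than fromordinal.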
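As a rough sketch of that idea, the following assumes a hypothetical PolarPoint class (not part of the question) that stores r and theta but exposes the same x, y, and midpoint API as the Cartesian Point.

```python
import math
from collections import namedtuple


class Point(namedtuple('Point', 'x y')):
    """Cartesian representation, as in the question."""
    __slots__ = ()

    def midpoint(self, other):
        return Point((self.x + other.x) / 2.0, (self.y + other.y) / 2.0)


class PolarPoint(namedtuple('PolarPoint', 'r theta')):
    """Hypothetical class: same public API, polar internal storage."""
    __slots__ = ()

    @property
    def x(self):
        return self.r * math.cos(self.theta)

    @property
    def y(self):
        return self.r * math.sin(self.theta)

    def midpoint(self, other):
        # Compute in Cartesian terms, then convert back to polar.
        mx = (self.x + other.x) / 2.0
        my = (self.y + other.y) / 2.0
        return PolarPoint(math.hypot(mx, my), math.atan2(my, mx))


a = PolarPoint(2.0, 0.0)   # same location as Point(2.0, 0.0)
b = Point(0.0, 2.0)

print(a.midpoint(b))  # a PolarPoint
print(b.midpoint(a))  # a Point
```

Callers only ever write p.midpoint(q); each class decides how to do the arithmetic and which representation to return.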

+3

Raymond Hettinger Oct 25 '11 at 23:19

In this case, you can use operator overloading:

    from collections import namedtuple

    class Point(namedtuple('Point', 'x y')):
        __slots__ = ()

        # Attached to class
        def __add__(self, otherpoint):
            mx = self.x + otherpoint.x
            my = self.y + otherpoint.y
            return Point(mx, my)

        def __truediv__(self, scalar):
            return Point(self.x / scalar, self.y / scalar)

        # Python 2 looks up __div__ for the / operator
        __div__ = __truediv__

    a = Point(1.0, 2.0)
    b = Point(2.0, 3.0)

    def mid(a, b):  # general function
        return (a + b) / 2

    print(mid(a, b))
    # Point(x=1.5, y=2.5)

I think the decision depends mostly on how general and abstract the function is. If you can write the function so that it works on all objects that implement a small set of clean interfaces, then you can turn it into a separate function. The more interfaces your function depends on, and the more specific they are, the more it makes sense to put it in a class (since instances of that class are likely to be the only objects the function will work with anyway).

+2

Jochen Ritzel Oct 26 '11 at 0:55

I would choose option one, because that way all the functionality for points is kept in the Point class, i.e. related functionality is grouped together. In addition, Point objects know best about the meaning and inner workings of their own data, so the class is the right place to implement your function. In C++, for example, an external function would have to be a friend, which smells like a hack.

+1

hochl Oct 25 '11 at 23:19

Another option is to use a @classmethod. It is probably what I would prefer in this case.

    class Point(...):
        @classmethod
        def midpoint(cls, p1, p2):
            mx = (p1.x + p2.x) / 2.0
            my = (p1.y + p2.y) / 2.0
            return cls(mx, my)

    # ...
    print(Point.midpoint(a, b))

+1

codeape Oct 25 '11 at 23:19

Another way to do this is to access x and y through the namedtuple indexing interface. Then you can completely generalize the midpoint function to n dimensions.

    from collections import namedtuple

    class Point(namedtuple('Point', 'x y')):
        __slots__ = ()

    def midpoint(left, right):
        # Pair up the coordinates of both arguments and average each pair
        return tuple([sum(a) / 2. for a in zip(left, right)])

This design works for Point classes, n-tuples, lists of length n, and so on. For instance:

    >>> midpoint(Point(0,0), Point(1,1))
    (0.5, 0.5)
    >>> midpoint(Point(5,1), (3, 2))
    (4.0, 1.5)
    >>> midpoint((1,2,3), (4,5,6))
    (2.5, 3.5, 4.5)

0

Nathan Oct 26 '11 at 3:29

Nathan · Accepted Answer · 2011-10-25T23:39:31+0000

I would choose the second option because, in my opinion, it is clearer than the first. You are performing the midpoint operation between two points, not the midpoint with respect to one point. Similarly, a natural extension of this interface could be to define dot, cross, magnitude, average, median, etc. Some of these will operate on pairs of Points, while others may operate on lists. Making them all functions gives them consistent interfaces.

Defining it as a function also allows it to be used with any pair of objects that present a .x .y interface, whereas making it a method requires at least one of the two to be a Point.

Finally, as for where the function should live, I think it makes sense to place it in the same package as the Point class. This puts it in the same namespace, which clearly indicates its relationship with Point and is, in my opinion, more Pythonic than a static or class method.

Update: Further reading on the Pythonicness of @staticmethod vs. package/module:

In both Thomas Wouters' answer to the question "What is the difference between staticmethod and classmethod in Python" and Mike Steder's answer to "init and arguments in Python", the authors indicated that a package or module of related functions is perhaps a better solution. Thomas Wouters has this to say:
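To illustrate the duck-typing point, here is a minimal sketch. Pixel is a hypothetical class invented for this example; it is unrelated to Point and merely happens to expose .x and .y attributes.

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')


class Pixel(object):
    """Hypothetical unrelated class that happens to expose .x and .y."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def midpoint(p1, p2):
    # Relies only on the .x/.y attributes, not on the argument types.
    return Point((p1.x + p2.x) / 2.0, (p1.y + p2.y) / 2.0)


print(midpoint(Pixel(0, 0), Pixel(4, 2)))        # Point(x=2.0, y=1.0)
print(midpoint(Point(1.0, 2.0), Pixel(2, 3)))    # Point(x=1.5, y=2.5)
```

With a method, the first call would not be possible without also adding midpoint to Pixel.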

[staticmethod] is basically useless in Python - you can just use a module function instead of a staticmethod.

While Mike Steder comments:

If you find yourself creating objects that consist of nothing but staticmethods, the more Pythonic thing to do would be to create a new module of related functions.

However, codeape rightly points out below that a calling convention of Point.midpoint(a,b) co-locates the functionality with the type. The BDFL also seems to value @staticmethod, since the __new__ method is a staticmethod.

My personal preference would be to use a function for the reasons stated above, but it seems that the choice between @staticmethod and a stand-alone function is largely in the eye of the beholder.
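For comparison, a @staticmethod version of the question's example might look like the following sketch, which keeps the symmetric two-argument signature while still living on the class.

```python
from collections import namedtuple


class Point(namedtuple('Point', 'x y')):
    __slots__ = ()

    @staticmethod
    def midpoint(p1, p2):
        # No self or cls: the two points are treated symmetrically.
        mx = (p1.x + p2.x) / 2.0
        my = (p1.y + p2.y) / 2.0
        return Point(mx, my)


a = Point(1.0, 2.0)
b = Point(2.0, 3.0)

print(Point.midpoint(a, b))  # Point(x=1.5, y=2.5)
```

Unlike the @classmethod variant, this always returns a Point, even when called through a subclass.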
