What are the differences between cpdef and cdef wrapped in def?

Question

There is an example in the Cython docs showing two ways to write a hybrid C / Python method. The explicit one uses cdef for fast C access and a def wrapper for access from Python:

    cdef class Rectangle:
        cdef int x0, y0
        cdef int x1, y1

        def __init__(self, int x0, int y0, int x1, int y1):
            self.x0 = x0; self.y0 = y0; self.x1 = x1; self.y1 = y1

        cdef int _area(self):
            cdef int area
            area = (self.x1 - self.x0) * (self.y1 - self.y0)
            if area < 0:
                area = -area
            return area

        def area(self):
            return self._area()

And one using cpdef:

    cdef class Rectangle:
        cdef int x0, y0
        cdef int x1, y1

        def __init__(self, int x0, int y0, int x1, int y1):
            self.x0 = x0; self.y0 = y0; self.x1 = x1; self.y1 = y1

        cpdef int area(self):
            cdef int area
            area = (self.x1 - self.x0) * (self.y1 - self.y0)
            if area < 0:
                area = -area
            return area

I was wondering what the differences are in practical terms.

For example, is the method faster / slower when called from C / Python?

Also, when subclassing / overriding, does cpdef offer something that the other approach lacks?

python cython

Paul panzer Feb 19 '18 at 10:57

2 answers

See docs here - for most purposes they are almost the same, cpdef has a bit more overhead, but plays better with inheritance.

The directive cpdef makes two versions of the method available; one fast for use from Cython and one slower for use from Python. Then:
This does slightly more than providing a Python wrapper for a cdef method: unlike a cdef method, a cpdef method is fully overridable by methods and instance attributes in Python subclasses. It adds a little calling overhead compared to a cdef method.

chrisb Feb 19 '18 at 14:59

ead · Accepted Answer · 2018-02-20T06:21:02+0000

chrisb's answer gives you everything you need to know, but if you are game for the gory details...

But first, the takeaways from the lengthy analysis below, in a nutshell:

For free functions, there is not much difference performance-wise between cpdef and rolling it out by hand with cdef + def. The resulting C code is almost identical.
For bound methods, the cpdef approach can be slightly faster in the presence of inheritance hierarchies, but it is nothing to get too excited about.
Using the cpdef syntax has its advantages, as the resulting code is clearer (at least to me) and shorter.

Free Functions:

When we define something silly like:

    cpdef do_nothing_cp():
        pass

the following happens:

1. A fast C function is created (in this case it has the cryptic name __pyx_f_3foo_do_nothing_cp because my extension is called foo, but you really only need to look for the f prefix).
2. A Python function is also created (called __pyx_pf_3foo_2do_nothing_cp, prefix pf); it does not duplicate the code but calls the fast function somewhere along the way.
3. A Python wrapper is created, called __pyx_pw_3foo_3do_nothing_cp (prefix pw).
4. A method definition for do_nothing_cp is emitted; this is what the Python wrapper is needed for, and it is the place that records which function should be called when foo.do_nothing_cp is invoked.

Here you can see it in the generated c-code:

    static PyMethodDef __pyx_methods[] = {
      {"do_nothing_cp", (PyCFunction)__pyx_pw_3foo_3do_nothing_cp, METH_NOARGS, 0},
      {0, 0, 0, 0}
    };

For the cdef function, only the first step is performed; for the def function, only steps 2-4 are performed.

Now, when we load the foo module and call foo.do_nothing_cp(), the following happens:

1. The function pointer bound to the name do_nothing_cp is found; in our case, that is the Python wrapper, the pw function.
2. The pw function is called via the function pointer and calls the pf function (as a plain C function call).
3. The pf function calls the fast f function.

What happens if we call do_nothing_cp inside a cython module?

    def call_do_nothing_cp():
        do_nothing_cp()

Obviously, cython does not need the python machinery to locate the function in this case - it can use the fast f function directly via a C function call, bypassing the pw and pf functions.
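To make the two call paths concrete, here is a rough pure-Python model of the layering. It is only a sketch, not what Cython emits: f_do_nothing_cp, pf_do_nothing_cp, pw_do_nothing_cp, METHOD_TABLE and the call log are made-up stand-ins for the mangled C symbols and the PyMethodDef table.

```python
# Pure-Python model of the three layers generated for a cpdef function.
# All names are hypothetical stand-ins for the __pyx_f_/__pyx_pf_/__pyx_pw_ symbols.

call_log = []


def f_do_nothing_cp():
    """Stand-in for the fast C function (f prefix): does the actual work."""
    call_log.append("f")


def pf_do_nothing_cp():
    """Stand-in for the Python-level function (pf prefix): delegates to f."""
    call_log.append("pf")
    return f_do_nothing_cp()


def pw_do_nothing_cp():
    """Stand-in for the Python wrapper (pw prefix): the registered entry point."""
    call_log.append("pw")
    return pf_do_nothing_cp()


# Stand-in for the method table: the public name maps to the pw wrapper.
METHOD_TABLE = {"do_nothing_cp": pw_do_nothing_cp}


def call_from_python(name):
    """A call from the interpreter looks the name up and runs pw -> pf -> f."""
    call_log.clear()
    METHOD_TABLE[name]()
    return list(call_log)


def call_from_cython():
    """A call from inside the module can go straight to f."""
    call_log.clear()
    f_do_nothing_cp()
    return list(call_log)
```

In this model, call_from_python("do_nothing_cp") records the path pw, pf, f, while call_from_cython() records only f.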

What happens if we wrap a cdef function in a def function?

    cdef _do_nothing():
        pass

    def do_nothing():
        _do_nothing()

Cython does the following:

1. A fast _do_nothing function is created, corresponding to the f function above.
2. A pf function for do_nothing is created, which calls _do_nothing somewhere along the way.
3. A Python wrapper, i.e. a pw function, is created, which wraps the pf function.
4. The functionality is bound to foo.do_nothing via a function pointer to the Python wrapper, the pw function.

As you can see, there is not much difference from the cpdef approach.

cdef functions are just plain C functions, but def and cpdef functions are first-class python functions - you can do something like this:

    foo.do_nothing = foo.do_nothing_cp

In terms of performance, we cannot expect much difference here:

    >>> import foo
    >>> %timeit foo.do_nothing_cp
    51.6 ns ± 0.437 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

    >>> %timeit foo.do_nothing
    51.8 ns ± 0.369 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
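Outside IPython, a comparable measurement can be taken with the standard timeit module. The sketch below uses two pure-Python stand-ins (do_nothing_a and do_nothing_b are hypothetical; with the compiled extension one would pass the functions from foo instead), so the absolute numbers will not match the ones above.

```python
import timeit


def do_nothing_a():
    """Hypothetical stand-in for the cpdef variant."""
    pass


def do_nothing_b():
    """Hypothetical stand-in for the cdef + def variant."""
    pass


def ns_per_call(func, number=200_000, repeat=5):
    """Return the best-of-`repeat` time per call of `func`, in nanoseconds."""
    best = min(timeit.repeat(func, number=number, repeat=repeat))
    return best / number * 1e9


if __name__ == "__main__":
    for func in (do_nothing_a, do_nothing_b):
        print(f"{func.__name__}: {ns_per_call(func):.1f} ns per call")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; it reports a lower bound rather than a mean.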

If we look at the resulting machine code ( objdump -d foo.so ), we will see that the C compiler has inlined all the calls for the cpdef version do_nothing_cp:

    0000000000001340 <__pyx_pw_3foo_3do_nothing_cp>:
        1340:   48 8b 05 91 1c 20 00    mov    0x201c91(%rip),%rax
        1347:   48 83 00 01             addq   $0x1,(%rax)
        134b:   c3                      retq
        134c:   0f 1f 40 00             nopl   0x0(%rax)

but not for the rolled-out do_nothing (I have to admit, I'm a little surprised and do not yet understand the reasons):

    0000000000001380 <__pyx_pw_3foo_1do_nothing>:
        1380:   53                      push   %rbx
        1381:   48 8b 1d 50 1c 20 00    mov    0x201c50(%rip),%rbx        # 202fd8 <_DYNAMIC+0x208>
        1388:   48 8b 13                mov    (%rbx),%rdx
        138b:   48 85 d2                test   %rdx,%rdx
        138e:   75 0d                   jne    139d <__pyx_pw_3foo_1do_nothing+0x1d>
        1390:   48 8b 43 08             mov    0x8(%rbx),%rax
        1394:   48 89 df                mov    %rbx,%rdi
        1397:   ff 50 30                callq  *0x30(%rax)
        139a:   48 8b 13                mov    (%rbx),%rdx
        139d:   48 83 c2 01             add    $0x1,%rdx
        13a1:   48 89 d8                mov    %rbx,%rax
        13a4:   48 89 13                mov    %rdx,(%rbx)
        13a7:   5b                      pop    %rbx
        13a8:   c3                      retq
        13a9:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)

This could explain why the cpdef version is slightly faster, but in any case the difference is nothing compared to the overhead of a python function call.

Class methods:

The situation is a bit more complicated for class methods because of possible polymorphism. Let's start with:

    cdef class A:
        cpdef do_nothing_cp(self):
            pass

At first glance, there is not much difference from the case above:

1. A fast, C-only version of the function (f prefix) is emitted.
2. A python version (pf prefix) is emitted, which calls the f function.
3. A python wrapper (pw prefix) wraps the pf version and is used for registration.
4. do_nothing_cp is registered as a method of class A via tp_methods of the PyTypeObject.

As can be seen in the resulting C file:

    static PyMethodDef __pyx_methods_3foo_A[] = {
      {"do_nothing_cp", (PyCFunction)__pyx_pw_3foo_1A_1do_nothing_cp, METH_NOARGS, 0},
      ...
      {0, 0, 0, 0}
    };
    ....
    static PyTypeObject __pyx_type_3foo_A = {
      ...
      __pyx_methods_3foo_A, /*tp_methods*/
      ...
    };

Obviously, the bound version has to take the implicit parameter self as an additional argument, but there is more to it: the f function performs a dispatch if it is not called from the corresponding pf function. This dispatch looks as follows (pseudocode, I keep only the important parts):

    static PyObject *__pyx_f_3foo_1A_do_nothing_cp(CYTHON_UNUSED struct __pyx_obj_3foo_A *__pyx_v_self, int __pyx_skip_dispatch) {
      if (unlikely(__pyx_skip_dispatch)) ;  // __pyx_skip_dispatch=1 if called from pf-version
      /* Check if overridden in Python */
      else if (look-up if function is overridden in __dict__ of the object) {
        use the overridden function
      }
      do the work.
    }
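The dispatch can be imitated in plain Python. This is a simplified model under my own assumptions, not the generated code: _f_do_nothing_cp and the skip_dispatch flag are hypothetical stand-ins for the f function and __pyx_skip_dispatch, and the override check is reduced to comparing the looked-up attribute against the method defined on A.

```python
class A:
    """Pure-Python model of a cdef class with a cpdef method."""

    def _f_do_nothing_cp(self, skip_dispatch=False):
        # Model of the f function: unless told to skip, check whether a
        # Python subclass or the instance provides its own do_nothing_cp.
        if not skip_dispatch:
            candidate = self.do_nothing_cp
            if getattr(candidate, "__func__", candidate) is not A.do_nothing_cp:
                return candidate()  # use the override
        return "A does nothing"

    def do_nothing_cp(self):
        # Model of the pw/pf pair: Python already resolved the method,
        # so f is told to skip the dispatch.
        return self._f_do_nothing_cp(skip_dispatch=True)


class Sub(A):
    """A Python subclass whose override still reaches the base implementation."""

    def do_nothing_cp(self):
        return "Sub, then " + super().do_nothing_cp()
```

The skip flag is what keeps Sub().do_nothing_cp() from recursing forever: the base method is entered through the Python-level path, which passes skip_dispatch=True, so the f model does not look up the override again.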

Why is this needed? Consider the following foo extension:

    cdef class A:
        cpdef do_nothing_cp(self):
            pass

    cdef class B(A):
        cpdef call_do_nothing(self):
            self.do_nothing_cp()

What happens when we call B().call_do_nothing() ?

1. B-pw-call_do_nothing is located and called,
2. which calls B-pf-call_do_nothing,
3. which calls B-f-call_do_nothing,
4. which calls A-f-do_nothing_cp, bypassing the pw and pf versions.

What happens when we add the following class C, which overrides the do_nothing_cp function?

    import foo

    class C(foo.B):
        def do_nothing_cp(self):
            print("I do something!")

Now calling C().call_do_nothing() results in:

1. call_do_nothing of class C is located and called, which means that pw-call_do_nothing of class B is located and called,
2. which calls B-pf-call_do_nothing,
3. which calls B-f-call_do_nothing,
4. which calls A-f-do_nothing_cp (as we already know!), bypassing the pw and pf versions.

And now, in step 4, we need to dispatch the call inside A-f-do_nothing_cp() in order to arrive at the right C.do_nothing_cp() call! Luckily, we have exactly this dispatch in the function at hand!
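The practical consequence for the original question can be modelled in plain Python. The sketch rests on two assumptions of mine: the net effect of the cpdef dispatch is treated as an ordinary attribute lookup, and _do_nothing stands for a cdef method that Python code cannot see or replace. The method names call_do_nothing_cp and call_do_nothing on B are hypothetical.

```python
class A:
    # cpdef style: C-level callers still honor Python overrides.
    def do_nothing_cp(self):
        return "A.do_nothing_cp"

    # cdef + def style: a C-only worker plus a Python-visible wrapper.
    def _do_nothing(self):      # models `cdef _do_nothing(self)`
        return "A._do_nothing"

    def do_nothing(self):       # models the `def` wrapper
        return self._do_nothing()


class B(A):
    def call_do_nothing_cp(self):
        # Models a fast call to the cpdef method, including its override check.
        return self.do_nothing_cp()

    def call_do_nothing(self):
        # Models a fast call to the cdef worker; the def wrapper is not involved.
        return self._do_nothing()


class C(B):
    """Python subclass overriding only the Python-visible names."""

    def do_nothing_cp(self):
        return "C.do_nothing_cp"

    def do_nothing(self):
        return "C.do_nothing"
```

Here C().call_do_nothing_cp() ends up in C's override, while C().call_do_nothing() still runs the worker from A: overriding the def wrapper has no effect on code that calls the cdef method directly.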

To make it more complicated: what if class C were also a cdef class? The dispatch via __dict__ would not work, because cdef classes do not have a __dict__.

For cdef classes, polymorphism is implemented similarly to C++'s "virtual tables", so in B.call_do_nothing() the f-do_nothing_cp function is not called directly but via a pointer that depends on the class of the object (you can see these "virtual tables" being set up in __pyx_pymod_exec_XXX, e.g. __pyx_vtable_3foo_B.__pyx_base). Thus, the __dict__ dispatch in the A-f-do_nothing_cp() function is not needed in the case of a pure cdef hierarchy.
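The "virtual table" idea can be sketched in plain Python with dictionaries of functions. This is an illustration only: VTABLES, build_vtable, new_object and the dict-based objects are hypothetical, and the class C below plays the role of a cdef subclass that overrides do_nothing_cp.

```python
def build_vtable(base_vtable, overrides):
    """Copy the base class's table, then replace or add the given slots."""
    table = dict(base_vtable)
    table.update(overrides)
    return table


def a_do_nothing_cp(self):
    return "A.do_nothing_cp"


def b_call_do_nothing(self):
    # Not a direct call to a_do_nothing_cp: go through the object's table.
    return self["vtab"]["do_nothing_cp"](self)


def c_do_nothing_cp(self):
    return "C.do_nothing_cp"


# One table per class, set up once (in Cython this happens at module init).
VTABLES = {}
VTABLES["A"] = build_vtable({}, {"do_nothing_cp": a_do_nothing_cp})
VTABLES["B"] = build_vtable(VTABLES["A"], {"call_do_nothing": b_call_do_nothing})
VTABLES["C"] = build_vtable(VTABLES["B"], {"do_nothing_cp": c_do_nothing_cp})


def new_object(cls_name):
    """Each object carries a reference to the table of its own class."""
    return {"vtab": VTABLES[cls_name]}


def call(obj, name):
    """Look the slot up in the object's table and call it."""
    return obj["vtab"][name](obj)
```

With these tables, call(new_object("B"), "call_do_nothing") reaches A's implementation and call(new_object("C"), "call_do_nothing") reaches C's, without any __dict__ lookup.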

As for performance, comparing cpdef with cdef + def, I get:

                      cpdef     def+cdef
    A.do_nothing      107ns     108ns
    B.call_nothing    109ns     116ns

So the difference is not that large; if anything, cpdef is slightly faster.
