Why are callbacks ugly?

Recently, I have been listening to Guido van Rossum talk about asynchronous I/O in Python 3. I was surprised by the idea that developers "hate" callbacks, allegedly for their ugliness. I also discovered the concept of coroutines and started reading David Beazley's coroutine tutorial. So far, coroutines still look quite esoteric to me, more obscure and tricky than those "hated" callbacks.

Now I'm trying to figure out why some people find callbacks ugly. True, with callbacks the program no longer looks like a linear piece of code executing a single algorithm. But it stops being that as soon as it does asynchronous I/O anyway, and there is nothing good in pretending otherwise. Instead, I think of such a program as event-driven: you write it by defining how it reacts to the relevant events.

Or is there something else about callbacks that is considered bad, besides being non-linear?

1 answer

Consider this code, which reads a protocol message header:

    import struct

    def readn(sock, n):
        buf = b''
        while len(buf) < n:
            newbuf = sock.recv(n - len(buf))
            if not newbuf:
                raise EOFError('socket closed before %d bytes were read' % n)
            buf += newbuf
        return buf

    def readmsg(sock):
        msgtype = readn(sock, 4).decode('ascii')
        (size,) = struct.unpack('!I', readn(sock, 4))
        data = readn(sock, size)
        return msgtype, size, data

Obviously, if you want to handle multiple users at the same time, you cannot make blocking recv calls like this. So what can you do?

If you use threads, you do not need to change this code at all; just run each client in a separate thread and everything works. It is like magic. The problem with threads is that you cannot run 5000 of them at once without slowing your scheduler to a crawl, allocating so much stack space that you end up in swap hell, and so on. So the question is: how do we get the magic of threads without the problems?
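To make the "magic" concrete, here is a minimal thread-per-client sketch reusing the blocking readn above. The socketpair standing in for an accepted connection is my own illustration, not part of the original answer:

```python
import socket
import threading

def readn(sock, n):
    # Blocking read of exactly n bytes, as in the synchronous version above.
    buf = b''
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise EOFError('socket closed')
        buf += chunk
    return buf

results = []

def handle_client(conn):
    # Each client runs the same linear, blocking code in its own thread.
    results.append(readn(conn, 5))
    conn.close()

# Demo: a socketpair stands in for a freshly accepted connection.
server_side, client_side = socket.socketpair()
t = threading.Thread(target=handle_client, args=(server_side,))
t.start()
client_side.sendall(b'hello')
t.join()
client_side.close()
```

The protocol code stays exactly as linear as before; only the server loop changes, spawning one thread per accepted connection.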

Greenlets, i.e. implicitly scheduled lightweight threads, are one answer to the problem. Basically, you write threaded code, but it is actually run by a cooperative scheduler that interrupts your code every time you make a blocking call. The problem is that this involves monkeypatching all known blocking calls and hoping that no library you install adds any new ones.
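The monkeypatching part can be sketched in pure Python; greenlet libraries such as gevent do essentially this to every known blocking call. The interception log here is my own illustration of the mechanism, not gevent's actual implementation:

```python
import time

_real_sleep = time.sleep
intercepted = []

def cooperative_sleep(seconds):
    # A greenlet library would switch to another ready task here instead
    # of blocking the whole process; we just record the interception.
    intercepted.append(seconds)
    _real_sleep(0)  # give up the CPU without actually waiting

time.sleep = cooperative_sleep   # the monkeypatch
time.sleep(5)                    # unchanged "blocking" code, now intercepted
time.sleep = _real_sleep         # restore the original
```

Notice that the calling code does not change at all, which is exactly why a library you install can silently add a blocking call the patcher does not know about.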

Explicit coroutines are the answer to that problem. Since you explicitly mark every blocking call by putting yield from in front of it, nobody has to monkeypatch anything. You still need the functions you call to be async-compatible, but you can no longer accidentally block the entire server without noticing, and it is much clearer from your code what is happening. The disadvantage is that the reactor code under the covers has to be more complicated... but that is something you write once (or, better, zero times, because it comes from a framework or the stdlib).
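For comparison, here is roughly what the same protocol reader looks like as coroutines. I am sketching it with modern async/await and asyncio.StreamReader rather than the raw yield from style the answer describes; the canned-message demo is my own addition:

```python
import asyncio
import struct

async def readn(reader, n):
    buf = b''
    while len(buf) < n:
        chunk = await reader.read(n - len(buf))  # explicit suspension point
        if not chunk:
            raise EOFError('socket closed')
        buf += chunk
    return buf

async def readmsg(reader):
    msgtype = (await readn(reader, 4)).decode('ascii')
    (size,) = struct.unpack('!I', await readn(reader, 4))
    data = await readn(reader, size)
    return msgtype, size, data

async def demo():
    # Feed a canned message into a StreamReader instead of a real socket.
    reader = asyncio.StreamReader()
    reader.feed_data(b'PING' + struct.pack('!I', 5) + b'hello')
    reader.feed_eof()
    return await readmsg(reader)

msgtype, size, data = asyncio.run(demo())
```

Line for line it mirrors the blocking version; the only visible difference is the await marking each point where the scheduler may switch to another client.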

With callbacks, the code you write ultimately does the same things it does with coroutines, but the complexity now lives inside your protocol code. You effectively have to turn the flow of control inside out. The most obvious translation is pretty awful by comparison:

    def readn(sock, n, callback):
        buf = b''
        def on_recv(newbuf):
            nonlocal buf
            if not newbuf:
                callback(None, EOFError('socket closed'))
                return
            buf += newbuf
            if len(buf) == n:
                callback(buf)
                return
            async_read(sock, n - len(buf), on_recv)
        async_read(sock, n, on_recv)

    def readmsg(sock, callback):
        msgtype, size = None, None
        def on_recv_data(buf, err=None):
            if err:
                callback(None, err)
                return
            callback(msgtype, size, buf)
        def on_recv_size(buf, err=None):
            nonlocal size
            if err:
                callback(None, err)
                return
            (size,) = struct.unpack('!I', buf)
            readn(sock, size, on_recv_data)
        def on_recv_msgtype(buf, err=None):
            nonlocal msgtype
            if err:
                callback(None, err)
                return
            msgtype = buf.decode('ascii')
            readn(sock, 4, on_recv_size)
        readn(sock, 4, on_recv_msgtype)

Now, obviously, in real life anyone who writes callback code like this should be shot; there are much better ways to organize it, e.g. using futures or deferreds, or using a class with methods instead of a chain of local closures defined in reverse order with nonlocal statements, and so on.
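One of those "much better ways", a class with one method per state instead of nested closures, might look like the following sketch. The async_read-style transport interface is hypothetical, invented for the example (and made synchronous so the demo runs to completion):

```python
import struct

class FakeTransport:
    # Hypothetical transport: async_read(n, cb) delivers n bytes to cb.
    # A real one would call cb later, from the event loop.
    def __init__(self, data):
        self.data = data
    def async_read(self, n, cb):
        chunk, self.data = self.data[:n], self.data[n:]
        cb(chunk)

class MsgReader:
    """Reads one (msgtype, size, data) message; one method per protocol state."""
    def __init__(self, transport, callback):
        self.transport = transport
        self.callback = callback
        self.transport.async_read(4, self.on_msgtype)
    def on_msgtype(self, buf, err=None):
        if err:
            return self.callback(None, err)
        self.msgtype = buf.decode('ascii')
        self.transport.async_read(4, self.on_size)
    def on_size(self, buf, err=None):
        if err:
            return self.callback(None, err)
        (self.size,) = struct.unpack('!I', buf)
        self.transport.async_read(self.size, self.on_data)
    def on_data(self, buf, err=None):
        if err:
            return self.callback(None, err)
        self.callback((self.msgtype, self.size, buf))

received = []
MsgReader(FakeTransport(b'PING' + struct.pack('!I', 5) + b'hello'),
          received.append)
```

The methods at least read top-to-bottom in protocol order, but the control flow is still inside out: each step names the next one instead of just falling through to it.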

But the point is that there is no way to write it so that it looks even remotely like the synchronous version. The flow of control is inherently central, and the protocol logic is secondary. With coroutines, because the flow of control is simply "run the next line", it is not explicit in your code at all, and the protocol logic is all you have to read and write.


That said, there are plenty of places where the callback version of something is better than the coroutine (or synchronous) version, because the whole point of the code is wiring asynchronous events together.

If you read the Twisted tutorial, you will see that it is not that hard to get the two mechanisms to play well together. If you write everything around Deferreds, you can freely mix deferred-composition code, explicit callbacks, and @inlineCallbacks-style coroutines. In some parts of your code the control flow is what matters and the logic is trivial; in other parts the logic is complex and you do not want it obscured by the flow of control. This way, you can use whatever makes sense in each case.
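The idea behind @inlineCallbacks can be sketched with a toy trampoline: a decorator drives a generator, sending each yielded result back in. Real Twisted yields Deferreds and resumes the generator only when they fire; this simplified stand-in resolves every yielded value immediately:

```python
def inline_callbacks(gen_func):
    # Toy version: drive the generator to completion, feeding each yielded
    # value straight back in.  Twisted instead waits for a Deferred to fire
    # before resuming, but the control flow has the same shape.
    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        result = None
        while True:
            try:
                result = gen.send(result)
            except StopIteration as stop:
                return stop.value
    return wrapper

@inline_callbacks
def fetch():
    a = yield 1          # stand-in for: a = yield some_deferred
    b = yield a + 1      # runs only after the previous "deferred" resolves
    return a + b
```

This is why the decorated code reads linearly: the generator suspends at each yield, and the driver, not the protocol code, handles the "what happens next" bookkeeping.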


Actually, it is worth comparing generators-as-coroutines with generators-as-iterators. Consider:

    def squares(n):
        for i in range(n):
            yield i * i

    def squares(n):
        class Iterator:
            def __init__(self, n):
                self.i = 0
                self.n = n
            def __iter__(self):
                return self
            def __next__(self):
                if self.i >= self.n:
                    raise StopIteration
                i, self.i = self.i, self.i + 1
                return i * i
        return Iterator(n)

In the first version, a lot of "magic" is hidden: the iterator's state between next calls is not declared anywhere; it is implicit in the local frame of the generator function. And every time you execute yield, the state of the entire program may change before next returns. Nevertheless, the first version is obviously much clearer and simpler, because there is almost nothing to read except the actual logic of producing N squares.
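You can actually see that implicit state by peeking at the suspended generator's frame. Note that gi_frame is CPython-specific introspection, used here purely for illustration:

```python
def squares(n):
    for i in range(n):
        yield i * i

gen = squares(5)
next(gen)   # produces 0
next(gen)   # produces 1
# The "iterator state" lives in the generator's suspended stack frame:
state = dict(gen.gi_frame.f_locals)
```

After two next calls, the frame holds n = 5 and i = 1, which is exactly the bookkeeping the hand-written Iterator class has to carry around explicitly in self.n and self.i.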

Obviously, you would not want to push all the state of every program you have ever written into generators. But refusing to use generators in general because they hide state transitions would be like refusing to use a for loop because it hides jumps in the program counter. And exactly the same argument applies to coroutines.


Source: https://habr.com/ru/post/1499879/

