Simple non-network concurrency with Twisted

I have a problem using Twisted for simple concurrency in Python: I do not know how to do it, and all the online resources I can find deal with Twisted's networking capabilities. So I turn to the SO gurus for some guidance.

I am using Python 2.5.

A simplified version of my problem looks like this:

  • A bunch of scientific data.
  • A function that iterates over the data and produces output.
  • ??? <- concurrency is introduced here: it takes pieces of data from 1 and passes them to 2.
  • The output from 3 is combined and saved.

I guess the Twisted reactor can do the job of step 3. But how?

Thanks so much for any help and suggestions.

upd1:

A simple code example. I don’t know how the reactor works with processes, so I gave it imaginary functions:

    datum = 'abcdefg'

    def dataServer(data):
        for char in data:
            yield char

    def dataWorker(char):
        return ord(char)

    # reactor(), working(), addTask() and finishedProcesses() are imaginary --
    # this only sketches the behaviour I am after
    r = reactor()
    NUMBER_OF_PROCESSES_AV = 4
    serv = dataServer(datum)
    id = 0
    result = [None] * len(datum)
    while r.working():
        if NUMBER_OF_PROCESSES_AV > 0:
            # hand dataWorker the next piece of data, tagged with its id
            r.addTask(dataWorker, serv.next(), id)
            NUMBER_OF_PROCESSES_AV -= 1
            id += 1
        for pr, id in r.finishedProcesses():
            result[id] = pr
            NUMBER_OF_PROCESSES_AV += 1   # free the worker slot again
+4
4 answers

It seems to me that you are misunderstanding the basics of Twisted. I recommend you give Dave Peticolas' Twisted Introduction a read. It helped me a lot, and I have been using Twisted for many years!

Hint: Everything in Twisted depends on the reactor !

[Figure: the reactor loop (source: krondo.com)]
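To make that concrete, here is a minimal sketch of the idea (real Twisted API, Python 2 syntax; say_hello is just an illustrative callback): nothing runs until you hand control to the reactor, and work is expressed as callbacks that the reactor invokes from its loop.

    from twisted.internet import reactor

    def say_hello():
        print 'hello from inside the reactor loop'
        reactor.stop()              # stopping the reactor ends the program

    # schedule a callback; it only fires once the reactor loop is running
    reactor.callLater(1, say_hello)
    reactor.run()                   # hand control to the reactor loop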

+1

As Jean-Paul said, Twisted is great for coordinating multiple processes. However, if you do not have to use Twisted and just need a distributed processing pool, there may be more suitable tools.

One that comes to mind and has not been mentioned yet is Celery. Celery is a distributed task queue: you set up a queue backed by a database, Redis, or RabbitMQ (you can choose among several free-software brokers) and write a number of computational tasks, which can be arbitrary scientific calculations. Tasks can spawn sub-tasks (which covers the "join" step you mention above). You then start as many workers as you need and let them compute.
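As a rough sketch of what a task module might look like (this uses the modern app-based Celery API, which needs a newer Python than 2.5; the broker URL and the compute task are just placeholders):

    # tasks.py -- a hedged sketch; assumes a RabbitMQ broker at the given URL
    from celery import Celery

    app = Celery('tasks', broker='amqp://guest@localhost//')

    @app.task
    def compute(chara):
        # an arbitrary computation on one piece of data
        return ord(chara)

You would then start workers with "celery -A tasks worker", call compute.delay(c) for every piece of data from your own code, and collect the results with .get() (which requires a result backend to be configured).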

I am a heavy user of both Twisted and Celery, so either option is a good one.

+4

To actually compute things at the same time you will probably have to use multiple Python processes. A single Python process can interleave computations, but it will not execute them in parallel (with a few exceptions).

Twisted is a good way to coordinate these multiple processes and collect their results. One library focused on solving this problem is Ampoule. More information about Ampoule can be found on the Launchpad page: https://launchpad.net/ampoule .
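I will not try to reproduce Ampoule's API from memory, but the coordination idea looks roughly like this with stock Twisted: spawn one child process per piece of work and gather the resulting Deferreds (worker.py is a hypothetical script that writes its result to stdout):

    import os
    from twisted.internet import reactor, defer, utils

    def run_workers():
        chunks = ['abc', 'def', 'g']
        # one child process per chunk; pass the parent's environment so
        # 'python' can be found on the PATH
        deferreds = [utils.getProcessOutput('python', ['worker.py', chunk],
                                            env=os.environ)
                     for chunk in chunks]
        return defer.gatherResults(deferreds).addCallback(combine)

    def combine(outputs):
        print outputs               # the collected stdout of all the workers
        reactor.stop()

    reactor.callWhenRunning(run_workers)
    reactor.run()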

+3

Do you need Twisted at all?

From your description of the problem, multiprocessing sounds like it would fit the bill: create several Process objects that all share a single Queue instance, have them do their work and put() their results on the Queue, then read the results back with blocking get() calls, as in the sketch below.
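A minimal sketch of that pattern (multiprocessing is in the standard library from Python 2.6; the chunking and the ord() work are placeholders for your real computation):

    from multiprocessing import Process, Queue

    def worker(chunk, out_queue):
        # chunk is a list of (index, char) pairs; push (index, result) back
        for i, char in chunk:
            out_queue.put((i, ord(char)))

    if __name__ == '__main__':
        datum = 'abcdefg'
        indexed = list(enumerate(datum))
        chunks = [indexed[i::4] for i in range(4)]   # four worker processes
        queue = Queue()
        procs = [Process(target=worker, args=(chunk, queue)) for chunk in chunks]
        for p in procs:
            p.start()
        result = [None] * len(datum)
        for _ in range(len(datum)):   # blocking get() until every result is in
            i, value = queue.get()
            result[i] = value
        for p in procs:
            p.join()
        print result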

+2

Source: https://habr.com/ru/post/1305469/

