Boost::asio: keeping asynchronous events local to a thread

I will create x threads in my application server, where x is the number of cores on the machine, and each thread will be pinned (without hyperthreading) to a core. Naturally, with this scheme I would like to distribute incoming connections across the threads so that once a connection is assigned to a thread, it is served only from that thread. How is this achieved in Boost::asio?

My thinking: one listening socket bound to a common address, shared by several io_service instances, where each thread gets its own io_service . Is this reasoning correct?

Edit: it looks like I will have to answer this question myself.

+4
3 answers

Yes, your reasoning is mostly correct. You would create one thread per core, one io_service instance per thread, and call io_service.run() on each thread.

However, the question is whether you really should. These are the problems I see:

  • Depending on how work is balanced across your connections, you may end up with some cores very busy and others idle. Micro-optimizing for cache hits on one core may mean you lose the ability to run work on an idle core when the "optimal" core is busy.

  • At socket speeds (i.e. slow), how much do you gain from CPU cache hits? If a single connection needs enough CPU to keep a core busy, and you have only as many connections as cores, then fine. Otherwise, the inability to move work around to cope with variations in the workload may outweigh any gain from cache hits. And if you do many different jobs on each thread, the cache will not stay hot anyway.

  • If you are just doing I/O, the cache win may not be that big anyway. It depends on your actual workload.

My recommendation would be to have one io_service instance and call io_service.run() from one thread per core. If that gives inadequate performance, or you have connection classes with a high per-connection CPU cost that could benefit from cache locality, move those onto dedicated io_service instances.

This is a case where you need to profile to find out how much cache misses are costing you, and where.

+5

If your server application is to run on a Windows machine, you should consider using I/O completion ports.

A completion port can limit the number of active threads to the number of cores, and it distributes I/O events from a theoretically unlimited number of sockets across the active threads. Scheduling is done by the OS. Here is a good example of how to do this.

+2

You can use a single io_service shared by several threads, together with a strand , to ensure that a connection's handlers never run concurrently (a strand serializes them, though not necessarily on the same thread). Take a look at the HTTP Server 3 example.

0

Source: https://habr.com/ru/post/1302218/
