I am testing the throughput of the Majordomo broker. The test_client.c that comes with the Majordomo code on GitHub sends requests synchronously (one at a time, waiting for each reply). I want to find the maximum throughput the Majordomo broker can achieve; the specification (http://rfc.zeromq.org/spec:7) says it can switch up to a million messages per second.
First, I changed the client code to send 100k requests asynchronously (without waiting for replies). Even after setting the HWM high enough on all sockets and increasing the TCP buffers to 4 MB, I was losing messages with three clients running in parallel.
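(For reference, setting the high-water mark on a CZMQ socket looks roughly like the sketch below. It assumes the CZMQ 2.x zsocket_* API; the endpoint and the HWM value are illustrative only, and since the Majordomo classes create their sockets inside the library code, the equivalent calls have to go there, before the connect/bind.)

    //  Sketch: raising the high-water marks on a CZMQ socket
    //  (CZMQ 2.x zsocket API assumed; values and endpoint illustrative)
    #include "czmq.h"

    int main (void)
    {
        zctx_t *ctx = zctx_new ();
        void *sock = zsocket_new (ctx, ZMQ_DEALER);
        zsocket_set_sndhwm (sock, 200000);      //  Allow up to 200k queued outgoing messages
        zsocket_set_rcvhwm (sock, 200000);      //  Allow up to 200k queued incoming messages
        zsocket_connect (sock, "tcp://localhost:5555");
        //  ... send/receive as usual ...
        zctx_destroy (&ctx);
        return 0;
    }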
So, I changed the client to send 10k requests up front, and then send two new requests for each reply received. I chose 10k because it let me run up to ten clients in parallel (each sending 100k messages in total) without losing messages. Here is the client code:
#include "../include/mdp.h" #include <time.h> int main (int argc, char *argv []) { int verbose = (argc > 1 && streq (argv [1], "-v")); mdp_client_t *session = mdp_client_new (argv[1], verbose); int count1, count2; struct timeval start,end; gettimeofday(&start, NULL); for (count1 = 0; count1 < 10000; count1++) { zmsg_t *request = zmsg_new (); zmsg_pushstr (request, "Hello world"); mdp_client_send (session, "echo", &request); } for (count1 = 0; count1 < 45000; count1++) { zmsg_t *reply = mdp_client_recv (session,NULL,NULL); if (reply) { zmsg_destroy (&reply); zmsg_t *request = zmsg_new (); zmsg_pushstr (request, "Hello world"); mdp_client_send (session, "echo", &request); request = zmsg_new (); zmsg_pushstr (request, "Hello world"); mdp_client_send (session, "echo", &request); } else break; // Interrupted by Ctrl-C } /* receiving the remaining 55k replies */ for(count1 = 45000; count1 < 100000; count1++) { zmsg_t *reply = mdp_client_recv (session,NULL,NULL); if (reply) { zmsg_destroy (&reply); } else break; } gettimeofday(&end, NULL); long elapsed = (end.tv_sec - start.tv_sec) +((end.tv_usec - start.tv_usec)/1000000); printf("time = %ld\n", elapsed); printf ("%d replies received\n", count1); mdp_client_destroy (&session); return 0; }
I launched the broker, a worker, and the clients on the same machine. Here are the recorded times:
    Clients in parallel (each sends 100k)   Time elapsed (seconds)
     1                                       4
     2                                       9
     3                                      12
     4                                      16
     5                                      21
    10                                      43
So the broker takes about 4 seconds per 100k requests. Is this expected behavior? (100,000 request-reply round trips in ~4 seconds is ~25,000 round trips per second, and each round trip passes through the broker twice, so the broker is switching roughly 50,000 messages per second.) I am not sure how to get anywhere near a million messages per second.
LATEST UPDATE:
I came up with an approach to increase the overall throughput:
- Use two brokers instead of one. One broker (broker1) carries requests from clients to workers; the other (broker2) carries replies from workers back to clients.
- Workers register with broker1.
- Clients generate a unique identifier and register with broker2.
- Along with each request, the client sends its unique identifier to broker1.
- The worker extracts the client's unique identifier from the request and sends its reply to broker2, together with that identifier so broker2 knows which client it belongs to (a sketch of both halves follows after this list).
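To make the frame layout concrete, here is a minimal sketch of both halves of this scheme. The helper names, the endpoints, and the assumption that the worker reaches broker2 through an ordinary mdp_client connection (with the client's unique id doubling as the service name the client registered under) are illustrative only, not code from the Majordomo library:

    #include "../include/mdp.h"

    //  Client side (hypothetical helper): push the client's unique id onto
    //  each request so the worker can route the reply through broker2;
    //  the id ends up as the first frame of the request.
    static void
    send_tagged_request (mdp_client_t *to_broker1, const char *client_id)
    {
        zmsg_t *request = zmsg_new ();
        zmsg_pushstr (request, "Hello world");
        zmsg_pushstr (request, client_id);
        mdp_client_send (to_broker1, "echo", &request);
    }

    //  Worker side (hypothetical helper): pop the client id off the request
    //  received from broker1, then send the reply to broker2 as an ordinary
    //  MDP client, using the id as the service name.
    static void
    forward_reply_to_broker2 (mdp_client_t *to_broker2, zmsg_t *request)
    {
        char *client_id = zmsg_popstr (request);
        zmsg_t *reply = zmsg_new ();
        zmsg_pushstr (reply, "Hello world");
        mdp_client_send (to_broker2, client_id, &reply);
        free (client_id);
    }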
With this setup, every 100k requests take about 2 seconds instead of 4 seconds (with one broker). I added gettimeofday calls to the broker code to measure how much latency the brokers themselves add.
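The instrumentation is essentially a pair of gettimeofday calls around the broker's per-message handling, accumulating the difference; a rough sketch is below (handle_one_message_timed is a hypothetical stand-in for the real routing code inside the broker loop):

    #include <sys/time.h>

    //  Accumulated time spent inside the broker's routing code, in microseconds
    static long long broker_usec = 0;

    //  Hypothetical wrapper: time one pass of the broker's routing code
    static void
    handle_one_message_timed (void)
    {
        struct timeval t0, t1;
        gettimeofday (&t0, NULL);
        //  ... the broker code that receives and forwards one message ...
        gettimeofday (&t1, NULL);
        broker_usec += (t1.tv_sec  - t0.tv_sec)  * 1000000LL
                     + (t1.tv_usec - t0.tv_usec);
    }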
Here is what I measured:
- 100k requests (total time ~2 seconds): latency added by the brokers ~2 seconds
- 200k requests (total time ~4 seconds): latency added by the brokers ~3 seconds
- 300k requests (total time ~7 seconds): latency added by the brokers ~5 seconds
So most of the time is being spent in the broker code. Can someone please suggest how to improve this?