Azure Service Fabric InvokeWithRetryAsync huge overhead

I am currently working on a Service Fabric microservice that needs high throughput.

I wondered why I cannot receive more than 500 1 KB messages per second on my workstation over loopback.

I removed all the business logic and attached a performance profiler to measure end-to-end performance.

It seems that ~96% of the time is spent resolving the client, and only ~2% on the actual HTTP requests.

For the test, I call Send in a tight loop:

    private HttpCommunicationClientFactory factory = new HttpCommunicationClientFactory();

    public async Task Send()
    {
        var client = new ServicePartitionClient<HttpCommunicationClient>(
            factory, new Uri("fabric:/MyApp/MyService"));

        await client.InvokeWithRetryAsync(c => c.HttpClient.GetAsync(c.Url + "/test"));
    }

[Profiler output screenshot]

Any ideas on this? According to the documentation, the way I call the service appears to follow Service Fabric best practice.

UPDATE: Caching the ServicePartitionClient improves performance, but with partitioned services I cannot cache the client, since I don't know up front which partition key a given call will need.
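One way around this (a sketch of my own, not from the original post) is to cache one client per partition key rather than a single client: ServicePartitionClient is designed to be cached and reused, and it re-resolves the endpoint internally when it becomes stale. The dictionary, `GetClient` helper, and the `long` partition-key type are assumptions for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Services.Client;
using Microsoft.ServiceFabric.Services.Communication.Client;

public class PartitionedSender
{
    private readonly HttpCommunicationClientFactory _factory = new HttpCommunicationClientFactory();
    private readonly Uri _serviceUri = new Uri("fabric:/MyApp/MyService");

    // One cached client per partition key; created lazily on first use.
    private readonly ConcurrentDictionary<long, ServicePartitionClient<HttpCommunicationClient>> _clients =
        new ConcurrentDictionary<long, ServicePartitionClient<HttpCommunicationClient>>();

    private ServicePartitionClient<HttpCommunicationClient> GetClient(long partitionKey)
    {
        // GetOrAdd creates the client only once per key and reuses it afterwards.
        return _clients.GetOrAdd(partitionKey, key =>
            new ServicePartitionClient<HttpCommunicationClient>(
                _factory, _serviceUri, new ServicePartitionKey(key)));
    }

    public Task Send(long partitionKey)
    {
        return GetClient(partitionKey)
            .InvokeWithRetryAsync(c => c.HttpClient.GetAsync(c.Url + "/test"));
    }
}
```

This keeps the per-call cost at a dictionary lookup once every partition key has been seen, while still getting the retry and re-resolution behavior of InvokeWithRetryAsync. It requires a Service Fabric cluster to actually run.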

UPDATE 2: Sorry for not including these details in the original question. We noticed the huge overhead of InvokeWithRetry when we first introduced socket-based communication.

You won't notice it with HTTP requests. An HTTP request already takes ~1 ms, so an extra ~0.5 ms from InvokeWithRetry is not significant.

But with raw sockets, where a call in our case takes ~0.005 ms, adding ~0.5 ms of InvokeWithRetry overhead is huge!

Here is an HTTP example; with InvokeWithRetryAsync it takes about 3 times as long:

    public async Task RunTest()
    {
        var factory = new HttpCommunicationClientFactory();
        var uri = new Uri("fabric:/MyApp/MyService");
        var count = 10000;

        // Example 1: ~6000ms
        for (var i = 0; i < count; i++)
        {
            var pClient1 = new ServicePartitionClient<HttpCommunicationClient>(
                factory, uri, new ServicePartitionKey(1));
            await pClient1.InvokeWithRetryAsync(c => c.HttpClient.GetAsync(c.Url));
        }

        // Example 2: ~1800ms
        var pClient2 = new ServicePartitionClient<HttpCommunicationClient>(
            factory, uri, new ServicePartitionKey(1));
        HttpCommunicationClient resolvedClient = null;
        await pClient2.InvokeWithRetryAsync(
            c =>
            {
                resolvedClient = c;
                return Task.FromResult(true);
            });
        for (var i = 0; i < count; i++)
        {
            await resolvedClient.HttpClient.GetAsync(resolvedClient.Url);
        }
    }

I know that InvokeWithRetry adds some nice things that I don't want to lose. But do I really need to resolve the partition on every call?

1 answer

I thought it would be interesting to actually benchmark this and see what the difference really is. I set up a basic test with a Stateful service that opens an HttpListener, and a client that calls the service in three different ways:

  • Create a new client for each call and run all calls sequentially

        for (var i = 0; i < count; i++)
        {
            var client = new ServicePartitionClient<HttpCommunicationClient>(
                _factory, _httpServiceUri, new ServicePartitionKey(1));
            var httpResponseMessage = await client.InvokeWithRetryAsync(
                c => c.HttpClient.GetAsync(c.Url + $"?index={id}"));
        }
  • Create the client only once and reuse it for each call in the sequence

        var client = new ServicePartitionClient<HttpCommunicationClient>(
            _factory, _httpServiceUri, new ServicePartitionKey(1));
        for (var i = 0; i < count; i++)
        {
            var httpResponseMessage = await client.InvokeWithRetryAsync(
                c => c.HttpClient.GetAsync(c.Url + $"?index={id}"));
        }
  • Create a new client for each call and run all calls in parallel

        var tasks = new List<Task>();
        for (var i = 0; i < count; i++)
        {
            tasks.Add(Task.Run(async () =>
            {
                var client = new ServicePartitionClient<HttpCommunicationClient>(
                    _factory, _httpServiceUri, new ServicePartitionKey(1));
                var httpResponseMessage = await client.InvokeWithRetryAsync(
                    c => c.HttpClient.GetAsync(c.Url + $"?index={id}"));
            }));
        }
        Task.WaitAll(tasks.ToArray());

I then ran the test for several call counts to get average values:

[Chart: call duration for the different client and call models]

Now, this should be taken for what it is; it is not a complete and exhaustive test in a controlled environment. A number of factors affect this performance, such as the cluster size, what the called service actually does (in this case, essentially nothing), and the size and complexity of the payload (in this case, a very short string).

In this test I also wanted to see how Fabric Transport behaves, and its performance turned out to be similar to HTTP transport (to be honest, I expected it to be a bit better, but that may not show in this trivial scenario).

It is worth noting that for the parallel execution of 10,000 calls, performance degraded significantly. This is probably because the service runs out of working memory, in which case some of the client calls fail and are retried (to be verified) after a delay, and the measured duration is the total time until all calls complete. It should also be noted that the test does not really let the service use more than one node, since all calls are routed to the same partition.
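One way to avoid that degradation (my own suggestion, not part of the original benchmark) is to bound the number of in-flight calls instead of starting all 10,000 tasks at once, e.g. with a SemaphoreSlim. The concurrency limit of 64 is an arbitrary illustrative value:

```csharp
// Sketch: throttled variant of the parallel test.
// Bounding concurrency keeps the service from being flooded,
// which reduces failed calls and internal retries.
var throttle = new SemaphoreSlim(64); // max 64 concurrent calls (tunable assumption)
var tasks = new List<Task>();
for (var i = 0; i < count; i++)
{
    await throttle.WaitAsync(); // wait for a free slot before starting the next call
    tasks.Add(Task.Run(async () =>
    {
        try
        {
            var client = new ServicePartitionClient<HttpCommunicationClient>(
                _factory, _httpServiceUri, new ServicePartitionKey(1));
            await client.InvokeWithRetryAsync(c => c.HttpClient.GetAsync(c.Url));
        }
        finally
        {
            throttle.Release(); // free the slot even if the call failed
        }
    }));
}
await Task.WhenAll(tasks);
```

With a cap like this, the total duration measures sustained throughput rather than the cost of overload and retries, which would make the parallel numbers easier to compare with the sequential ones.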

In conclusion, the effect of reusing the client is nominal, and for trivial calls HTTP transport performs similarly to Fabric Transport.


Source: https://habr.com/ru/post/1263737/
