Cosmos DB Graph: Gremlin.Net performance and throughput vs. Microsoft.Azure.Graphs

While learning how to use the graph API with Cosmos DB, I found two Microsoft tutorials: one using Gremlin.Net and one using Microsoft.Azure.Graphs.

Even though I run the same query with both, it executes differently.

Using Gremlin.Net, the query executes in a single call. Very often (I would say 70% of the time) I get a RequestRateTooLargeException. If I understand correctly, this means that I keep hitting the 400 RU/s limit I chose to start with. However, when the request goes through, it is twice as fast as the solution with Microsoft.Azure.Graphs.

In fact, with Microsoft.Azure.Graphs, I have to call ExecuteNextAsync in a loop that returns one result at a time.

So the questions are:
1) Which method should I use to get the best performance?
2) How can I find out the RU cost of my query so that I can tune it?
3) Is it possible to increase the throughput of an existing collection?

Update

Regarding question 3: in the "Data Explorer" blade of my database, my graph has a "Scale and Settings" entry where I can update the throughput.

Update 2

Regarding question 2: we cannot get the RU charge when using the first method (Gremlin.Net), but the Microsoft.Azure.Graphs ExecuteNextAsync method returns a FeedResponse with a RequestCharge property.
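Because ExecuteNextAsync returns one page per call, the cost of a whole query is the sum of the RequestCharge values over all pages. The bookkeeping can be sketched in a few lines of Python, kept library-neutral on purpose: `Page` is a hypothetical stand-in for the FeedResponse returned by the real .NET call, and the charges are illustrative numbers, not measurements.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Page:
    """Hypothetical stand-in for one FeedResponse: its results and its RU charge."""
    results: List[str]
    request_charge: float


def total_request_charge(pages: Iterable[Page]) -> float:
    """Sum the RU charge over every page of a single query."""
    return sum(page.request_charge for page in pages)


def max_queries_per_second(provisioned_ru_per_s: float, ru_per_query: float) -> float:
    """Upper bound on the sustained query rate before throttling sets in."""
    if ru_per_query <= 0:
        raise ValueError("ru_per_query must be positive")
    return provisioned_ru_per_s / ru_per_query


# Illustrative numbers only.
pages = [Page(["v1"], 12.5), Page(["v2"], 7.5)]
cost = total_request_charge(pages)          # total RU for the whole query
rate = max_queries_per_second(400, cost)    # sustainable rate at 400 RU/s
```

Dividing the provisioned throughput by the per-query cost gives a rough ceiling on how many such queries per second the collection can serve before 429 responses start to appear.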

1 answer

The reason you get the RequestRateTooLargeException (status code 429) with Gremlin.NET but not with Microsoft.Azure.Graphs is most likely the difference between the retry policy on the CosmosDB Gremlin server and the default retry policy of DocumentClient.

The default retry behavior of DocumentClient for these errors is described here:

By default, the DocumentClientException with status code 429 is returned after a cumulative wait time of 30 seconds if the request continues to operate above the request rate.

Therefore, Microsoft.Azure.Graphs can handle these errors from the server internally, retry them, and eventually succeed. This improves query reliability, but it hides the request-rate errors and increases execution time.
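To make the trade-off concrete, here is a simplified Python model of a client that retries throttled calls until a cumulative wait budget is used up. It is a sketch of the behavior described above, not the SDK's actual implementation: `ThrottledError`, `call_with_wait_budget`, and the assumption that every 429 carries a retry-after hint in seconds are all hypothetical.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Hypothetical stand-in for a 429 response carrying a retry-after hint."""

    def __init__(self, retry_after_s: float):
        super().__init__("429 Request rate is large")
        self.retry_after_s = retry_after_s


def call_with_wait_budget(
    operation: Callable[[], T],
    max_cumulative_wait_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[T, float]:
    """Retry throttled calls until the total wait would exceed the budget.

    Returns the result together with the seconds spent waiting, which is
    the hidden cost the caller pays in execution time.
    """
    waited = 0.0
    while True:
        try:
            return operation(), waited
        except ThrottledError as err:
            if waited + err.retry_after_s > max_cumulative_wait_s:
                raise  # budget exhausted: only now does the caller see the 429
            sleep(err.retry_after_s)
            waited += err.retry_after_s
```

With a 30 second budget the caller rarely sees a 429, but each absorbed retry adds its wait to the query's wall-clock time. A much smaller budget would surface the error almost immediately.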

On the CosmosDB Gremlin server, this retry allowance is much smaller, so RequestRateTooLargeException errors surface sooner.

To answer your questions:

1. Which method should I use to get the best performance?

Using the CosmosDB Gremlin server through Gremlin.NET is expected to give better performance. Microsoft.Azure.Graphs takes a different approach to query processing that involves more round trips to the server, so it carries the overhead of those extra requests.

2. How can I find out the RU cost of my query so that I can tune it?

RU charges will be included in the Gremlin server responses as attributes. Gremlin.NET currently cannot expose attributes on responses; however, changes to the client driver are being discussed here.

In the meantime, you can see how often your queries hit 429 errors by opening Metrics on your Azure CosmosDB account in the portal. It shows aggregated views of the number of requests, requests that exceeded capacity, storage quota, and so on for the collection.

3. Is it possible to increase the throughput of an existing collection?

As you found, you can increase the throughput of your existing collection through the portal. Alternatively, this can be done programmatically via the Microsoft.Azure.Documents SDK.


In conclusion, my recommendation is to add a retry policy around your Gremlin.NET requests that handles these exceptions by matching on the RequestRateTooLargeException message.
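One way such a retry policy could look, shown as a library-neutral Python sketch rather than Gremlin.NET code: `submit` is a hypothetical callable that wraps whatever sends the query, and the attempt count, base delay, and jitter are assumptions to tune for your workload. Since the server's retry-after hint is not available to the client yet, the sketch falls back on exponential backoff.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def is_throttled(error: Exception) -> bool:
    """Treat an error as throttling if its message names the exception type."""
    return "RequestRateTooLargeException" in str(error)


def submit_with_retry(
    submit: Callable[[], T],
    max_attempts: int = 5,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run submit(), retrying throttled failures with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return submit()
        except Exception as error:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_throttled(error):
                raise
            # 0.1 s, 0.2 s, 0.4 s, ... plus up to 100% random jitter.
            delay = base_delay_s * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
    raise ValueError("max_attempts must be at least 1")
```

Errors that do not mention RequestRateTooLargeException are re-raised immediately, so genuine query failures are not masked by the retry loop.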

Once response status attributes are exposed in Gremlin.NET, they will include:

  • the request charge,
  • the CosmosDB-specific status code (e.g. 429), and
  • a retry-after value indicating how long to wait to avoid 429 errors.
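Once those three values are available, a client could read them from the response and decide how long to pause. The Python sketch below only illustrates the idea: the key names `request-charge`, `status-code`, and `retry-after-ms` are hypothetical placeholders, and the retry-after value is assumed to be in milliseconds, so check the driver and service documentation for the real names and format.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class ThrottleInfo:
    request_charge: Optional[float]
    status_code: Optional[int]
    retry_after_s: Optional[float]

    @property
    def is_throttled(self) -> bool:
        return self.status_code == 429


def read_throttle_info(attributes: Mapping[str, Any]) -> ThrottleInfo:
    """Pull the three values out of a response-attribute mapping.

    Key names are hypothetical placeholders; retry-after is assumed to be in ms.
    """
    def number(key: str) -> Optional[float]:
        value = attributes.get(key)
        return None if value is None else float(value)

    status = number("status-code")
    retry_ms = number("retry-after-ms")
    return ThrottleInfo(
        request_charge=number("request-charge"),
        status_code=None if status is None else int(status),
        retry_after_s=None if retry_ms is None else retry_ms / 1000.0,
    )
```

A retry loop can then sleep for `retry_after_s` when `is_throttled` is true instead of guessing a delay, and log `request_charge` to track the RU cost of each query.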

Source: https://habr.com/ru/post/1275583/
