What is an efficient and cheap algorithm for generating ETags?

I have a REST API (built with Nancy running on ASP.NET) that can return a JSON object such as the following:

{ id: "1", name: "Fred", reviews: [ { id: "10", content: "I love Stack Overflow" } ] } 

Note that this object is not a single entity but a representation composed of several.

Usually I use the last-modified timestamp of the object in the database as the ETag, so that when the object is updated, the ETag changes. Simple.

But in this case, what if the user does not change, but the content of the first review does? With the ETag logic above, the ETag will not change. Here we have a case where a view includes several objects, and I'm trying to find a way to uniquely identify its current state.

So I need to somehow identify this view (which is a simple C# POCO stored in a Redis cache).

Here are my initial thoughts:

  • Object.GetHashCode(). Will not work, because the memory reference will always be different.
  • Serialize the object to a memory stream and take a SHA1 hash. Expensive to do on every request.
  • Before adding / updating the cache entry, create a GUID to use as the ETag and store it in the cache alongside the object. Then, when the cache entry is invalidated (as it would be in the example above), a new GUID is generated and the ETag changes. The problem with this approach is that it couples my ETag mechanism to my caching implementation, so the ETag logic is no longer independent of it.

Can anyone think of a cheap / efficient way to do this, ideally at a global level? (For example, on Object or a base class, rather than ETag generation logic specific to each object / resource.)

Thank you very much!

1 answer

I think hashing is not so bad. Extremely fast hash algorithms exist, such as MurmurHash3 (128-bit) and xxHash (64-bit), which I would consider. This is an effective way to do it globally, but unfortunately it is not the cheapest. You can find C# implementations here and here.
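As a rough, language-neutral sketch of the hashing approach (written in Python for brevity; the same idea carries over to C#): serialize the view in a canonical form and hash the bytes. The function name `etag_for` is made up for illustration, and `hashlib.blake2b` is used only as a stand-in because MurmurHash3 and xxHash are not in the Python standard library.

```python
import hashlib
import json


def etag_for(view) -> str:
    """Derive an ETag from the content of a JSON-serializable view.

    Hypothetical helper: blake2b stands in for a fast non-cryptographic
    hash such as MurmurHash3 or xxHash.
    """
    # Canonical serialization: sorted keys and fixed separators, so that
    # equal content always produces identical bytes.
    payload = json.dumps(view, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=16).hexdigest()
    # ETag values are sent as quoted strings in the HTTP header.
    return f'"{digest}"'


view = {
    "id": "1",
    "name": "Fred",
    "reviews": [{"id": "10", "content": "I love Stack Overflow"}],
}
etag = etag_for(view)
```

Because the hash covers the whole representation, a change to a nested review changes the ETag even when the top-level object is untouched. The cost is one serialization plus one hash per computed ETag, which is why it is effective but not the cheapest option.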

You said that every object in the database has a last-modified timestamp. If the model is composed of several entities, the model's ETag can be derived from the entity timestamps: it would be a concatenation of those timestamps. This approach is more efficient, but you cannot do it globally; you will need to write specific code for each model.
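A minimal sketch of the timestamp approach, again in Python and under assumptions: the function name `etag_from_timestamps` is hypothetical, and the timestamps are assumed to be timezone-aware values passed in a stable order (for example, the user first, then its reviews in id order).

```python
from datetime import datetime, timezone


def etag_from_timestamps(timestamps) -> str:
    """Build an ETag by concatenating entity last-modified timestamps.

    Hypothetical helper: expects timezone-aware datetimes in a stable order.
    """
    # Normalize to UTC and format with microsecond precision so that any
    # change to any entity's timestamp changes the resulting string.
    parts = [
        ts.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S%f")
        for ts in timestamps
    ]
    return '"' + "-".join(parts) + '"'


user_modified = datetime(2015, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
review_modified = datetime(2015, 1, 2, 8, 30, 0, tzinfo=timezone.utc)
etag = etag_from_timestamps([user_modified, review_modified])
```

No serialization or hashing of the payload is needed, but each model has to know which entities contribute timestamps. For views with many entities the concatenated value grows long, so hashing the joined string is one possible way to keep the header short.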


Source: https://habr.com/ru/post/1232509/

