Hash function for a list that is independent of the order of its elements

I want to have a dictionary that assigns a value to a set of integers.

For example, the key is [1 2 3], and it maps to a certain value.

The point is that [3 2 1] must be treated as the same key in my case, so the hashes should be equal if I go with the hash approach.

The set will contain from 2 to 10 elements.

The sum of the elements is usually fixed, so we cannot base the hash code on the sum, which would be the first natural idea here.

This is not homework; I actually ran into this problem in my code.

The set is basically an IEnumerable<int> in C#, so any data structure is fine for storing it.

Any help appreciated. Performance is also important here.

Immediate thought: we could sum items^2 and already get a somewhat better hash, but I would still like to hear other ideas.

EDIT: Hmm, sorry guys, everyone is suggesting sorting. It didn't occur to me to mention that sorting and then hashing is actually the solution I'm using now, and I'm looking for faster alternatives.

+4
9 answers

In principle, all the approaches here are instances of the same template: map x_1, ..., x_n to f(x_1) op ... op f(x_n), where op is a commutative, associative operation on some set X, and f is a map from elements to X. This template has been used a couple of times in ways that are supposedly good.

Modular exponentiation is slow, so don't use it. As for Zobrist hashing, with 3 million items the table for f probably won't fit in L2 cache, although it does set an upper bound of a single main-memory access.

Instead, I would take Zobrist hashing as a starting point and look for a cheap function f that behaves like a random function. That is essentially the job description of a non-cryptographic pseudorandom generator: I would try computing f by seeding a fast PRG with x and generating a single value.

EDIT: If all the sets have the same sum, don't choose f to be a polynomial of degree 1 (for example, the step function of a linear congruential generator). With addition as op, f(x) = a*x + b gives a hash that depends only on the sum and the number of elements.
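A minimal sketch of this template, assuming op is 32-bit wrapping addition and f is the finalizer step of MurmurHash3. The finalizer is my choice of a cheap, nonlinear, random-looking function and is not something the answer prescribes; the names OrderIndependentHash, Mix and Compute are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class OrderIndependentHash
{
    // f: MurmurHash3's 32-bit finalizer, used here as a stand-in for a
    // "random-looking" function. It is not a degree-1 polynomial.
    private static uint Mix(uint h)
    {
        unchecked
        {
            h ^= h >> 16;
            h *= 0x85ebca6bu;
            h ^= h >> 13;
            h *= 0xc2b2ae35u;
            h ^= h >> 16;
            return h;
        }
    }

    // op: addition modulo 2^32, which is commutative and associative,
    // so the result does not depend on the enumeration order.
    public static int Compute(IEnumerable<int> source)
    {
        unchecked
        {
            uint acc = 0;
            foreach (int x in source)
            {
                acc += Mix((uint)x);
            }
            return (int)acc;
        }
    }
}
```

Calling OrderIndependentHash.Compute on [1 2 3] and on [3 2 1] yields the same value without sorting. Equality of the keys still has to be checked separately, since different sets can collide.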

+4

Use HashSet<T> and HashSet<T>.CreateSetComparer(), which returns an IEqualityComparer<HashSet<T>>.
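As a rough illustration, the comparer can be passed to the Dictionary constructor so that two sets with the same elements act as the same key. The class name SetKeyedDictionary and the string value type are placeholders. Note that a HashSet<int> collapses duplicates, so this only fits if the keys really are sets.

```csharp
using System;
using System.Collections.Generic;

public static class SetKeyedDictionary
{
    // Builds a dictionary whose keys are compared by set contents
    // rather than by reference.
    public static Dictionary<HashSet<int>, string> Create()
    {
        return new Dictionary<HashSet<int>, string>(HashSet<int>.CreateSetComparer());
    }
}
```

With this, a value stored under new HashSet<int> { 1, 2, 3 } can be looked up with new HashSet<int> { 3, 2, 1 }.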

+2

I think the approach described in this paper will definitely help:

http://people.csail.mit.edu/devadas/pubs/mhashes.pdf

Incremental Multiset Hash Functions and Their Application to Memory Integrity Checking

Abstract: We introduce a new cryptographic tool: multiset hash functions. Unlike standard hash functions, which take strings as input, multiset hash functions operate on multisets (or sets). They map multisets of arbitrary finite size to strings (hashes) of fixed length. They are incremental in that, when new members are added to the multiset, the hash can be updated in time proportional to the change. The functions may be multiset-collision resistant, in that it is difficult to find two multisets that produce the same hash, or just set-collision resistant, in that it is difficult to find a set and a multiset that produce the same hash.

+1

I think your squaring idea goes in the right direction, but squaring is a poor choice of function. I would try something more like a PRNG function, or just multiplication by a large number, and then XOR all the resulting values together.
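A small sketch of the multiply-then-XOR variant. The multiplier 2654435761 (the constant often used for Knuth-style multiplicative hashing) is an arbitrary choice of mine, not one named in the answer, and the names MultiplyXorHash and Compute are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class MultiplyXorHash
{
    // Assumed multiplier: any large odd constant would do for this sketch.
    private const uint Multiplier = 2654435761u;

    public static int Compute(IEnumerable<int> source)
    {
        unchecked
        {
            uint acc = 0;
            foreach (int x in source)
            {
                // XOR is commutative and associative, so order does not matter.
                acc ^= (uint)x * Multiplier;
            }
            return (int)acc;
        }
    }
}
```

One caveat with XOR as the combining step: equal elements cancel each other, so [5 5] hashes to the same value as the empty list. That is harmless if the keys never contain duplicates.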

+1

One possibility: sort the items in the list, and then hash it.

0

You could sort the numbers and take a sample from predetermined indices, leaving the rest as zero if the current key has fewer numbers. Or you could XOR them, or whatever.

0

Why not something like

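    // Requires: using System.Collections.Generic; using System.Linq;
    public int GetOrderIndependantHashCode(IEnumerable<int> source)
    {
        // Accumulate x^2 + x^3 + x^4 in a single pass. Aggregate with unchecked
        // arithmetic wraps on overflow, whereas Enumerable.Sum would throw.
        return source.Aggregate(0, (acc, x) => unchecked(acc + x * x + x * x * x + x * x * x * x))
               & 0x7FFFFF;
    }
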
0

If the values in the key are restricted to small positive integers, you can map each one to a prime number with a simple lookup and then multiply the primes together to get the value.

Using the example in the question:

    [1, 2, 3] maps to 2 x 3 x 5 = 30
    [3, 2, 1] maps to 5 x 3 x 2 = 30

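A sketch of the prime-product idea, assuming the values lie in 1..15 so that a 15-entry lookup table is enough. That range, the table, and the names PrimeProductKey and Compute are illustrative assumptions. With at most 10 elements the product stays below 47^10, which fits in a ulong; the checked multiplication throws instead of silently wrapping if that assumption is violated.

```csharp
using System;
using System.Collections.Generic;

public static class PrimeProductKey
{
    // Assumed lookup table: value v (1-based) maps to the v-th prime.
    private static readonly ulong[] Primes =
    {
        2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47
    };

    public static ulong Compute(IEnumerable<int> source)
    {
        ulong product = 1;
        foreach (int x in source)
        {
            // Multiplication is commutative, so order does not matter.
            // An out-of-range x throws IndexOutOfRangeException.
            product = checked(product * Primes[x - 1]);
        }
        return product;
    }
}
```

Because prime factorizations are unique, two keys get the same product only if they contain the same values with the same multiplicities, so as long as nothing overflows the product can serve directly as the dictionary key.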
0

Create your own type that implements IEnumerable<T>.

Override GetHashCode. In it, sort your collection, call ToArray(), and return the array's GetHashCode().

-1

Source: https://habr.com/ru/post/1381965/