Generate a hash sum for multiple integers

I have several integers and need to generate a single value from them. For example:

Int 1: 14
Int 2: 4
Int 3: 8
Int 4: 4

Hash Sum: 43

The values are restricted: the maximum value an attribute can have is 30, and the sum of all of them is always 30. The attributes are always positive.

The key is that I want to generate the same hash sum for similar integers. For example, if I have the integers 14, 4, 10, 2, then I want to get the same hash sum as in the case above, 43. But of course, if the integers are very different (4, 4, 2, 20), then I should get a different hash sum. It also needs to be fast.

Ideally, I would like the output of the hash sum to be between 0 and 512, and it should be evenly distributed. With my restrictions there are around 5K different possibilities, so what I would like is about 10 per bucket.

I am sure there are many algorithms that do this, but I could not find the right terms to search for. Can someone suggest an algorithm for this?

Additional Information

The thing is that these integers are attributes for a function. I want to store the function values in a table, but I do not have enough memory to store all the different parameter combinations. This is why I want to generalize between similar attributes.

The reason 10, 5, 15 is completely different from 5, 10, 15 is that, if you plot them in 3D, they are two completely different points.

Additional Information 2

, . , . , . 3 , 3d, .

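if (att[0] < 5 && att[1] < 5 && att[2] < 5 && att[3] < 5)
     Block = 21;

/* Each range test is written out: 5 < x < 10 does not chain in C. */
if ((5 < att[0] && att[0] < 10) && (5 < att[1] && att[1] < 10) &&
    (5 < att[2] && att[2] < 10) && (5 < att[3] && att[3] < 10))
     Block = 45;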

, ifs, .

+3
8

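Given four inputs a, b, c and d, each in the range 0 to 30 (5 bits), the following produces a bucket number in the range 0 to 255 (8 bits).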

bucket = ((a & 0x18) << 3) | ((b & 0x18) << 1) | ((c & 0x18) >> 1) | ((d & 0x18) >> 3)

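This keeps the top two bits of each input and ignores the lower 3 bits, so 0-7 are treated as the same value, as are 8-15, and so on.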

0-7,0-7,0-7,0-7 -> bucket 0
0-7,0-7,0-7,8-15 -> bucket 1
0-7,0-7,0-7,16-23 -> bucket 2
...
24-30,24-30,24-30,24-30 -> bucket 255

A test loop that prints the bucket for every combination:

for (int a = 0; a <= 30; a++)
    for (int b = 0; b <= 30; b++)
        for (int c = 0; c <= 30; c++)
            for (int d = 0; d <= 30; d++) {
                int bucket = ((a & 0x18) << 3) |
                             ((b & 0x18) << 1) |
                             ((c & 0x18) >> 1) |
                             ((d & 0x18) >> 3);
                printf("%d, %d, %d, %d -> %d\n",
                         a,  b,  c,  d,   bucket);
            }
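
One way to check how evenly this bucket function spreads the inputs under the question's constraint (four values that sum to 30) is to count the tuples per bucket. This is a minimal sketch in C, not part of the original answer. The helper names bucket_of and count_buckets are made up for illustration, and it assumes zero is an allowed value, as in the loop above.

```c
#include <stdio.h>

/* Bucket function from the answer: keeps bits 3 and 4 of each input. */
int bucket_of(int a, int b, int c, int d)
{
    return ((a & 0x18) << 3) | ((b & 0x18) << 1) |
           ((c & 0x18) >> 1) | ((d & 0x18) >> 3);
}

/* Fills counts[0..255] with the number of tuples (a, b, c, d), each >= 0
   and summing to 30 (assumption), that land in each bucket.
   Returns the total number of tuples visited. */
int count_buckets(int counts[256])
{
    int total = 0;
    for (int i = 0; i < 256; i++)
        counts[i] = 0;
    for (int a = 0; a <= 30; a++)
        for (int b = 0; a + b <= 30; b++)
            for (int c = 0; a + b + c <= 30; c++) {
                int d = 30 - a - b - c;   /* forced by the sum constraint */
                counts[bucket_of(a, b, c, d)]++;
                total++;
            }
    return total;
}

/* Prints only the buckets that are actually used. */
void print_used_buckets(void)
{
    int counts[256];
    int total = count_buckets(counts);
    for (int i = 0; i < 256; i++)
        if (counts[i] > 0)
            printf("bucket %3d: %d tuples\n", i, counts[i]);
    printf("total tuples: %d\n", total);
}
```

Under that assumption there are 5456 tuples in total, but only 34 of the 256 buckets are ever hit. Bucket 0 and bucket 255, for instance, are unreachable, because four values below 8 (or four values of 24 and above) cannot sum to 30.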
+4

:

, , , (md5, sha ..).

, - :

  • P
  • 0 < a[i] < P ( )

, : sum(a[i] * x[i]) mod P
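
The formula above can be sketched in C as follows. The surrounding text is incomplete, so the details here are assumptions: P is taken to be a prime (509, which fits the 0 to 512 range asked for in the question), the coefficients in COEFF are arbitrary fixed values with 0 < a[i] < P, and the names linear_hash, COEFF and N_ATTR are hypothetical.

```c
#include <stddef.h>

/* Assumed parameters: P is a prime, N_ATTR is the number of attributes. */
enum { P = 509, N_ATTR = 4 };

/* Arbitrary fixed coefficients, each satisfying 0 < COEFF[i] < P. */
static const unsigned COEFF[N_ATTR] = { 131, 257, 389, 17 };

/* Computes sum(a[i] * x[i]) mod P for non-negative inputs x[i]. */
unsigned linear_hash(const int x[N_ATTR])
{
    unsigned sum = 0;
    for (size_t i = 0; i < N_ATTR; i++)
        sum += COEFF[i] * (unsigned)x[i];
    return sum % P;
}
```

With these coefficients, the question's example 14, 4, 8, 4 hashes to 443, while the reordered input 4, 14, 8, 4 hashes to 176, so the result depends on the order of the attributes. A hash of this kind spreads inputs over the range but does not by itself send similar inputs to the same value.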

+5

What exactly counts as similar? For example, are 50 5 5 10 and 5 5 10 50 similar, or 52 7 4 12 and 50 5 5 10? Perhaps something like this:

long hash = 13;
for (int i = 0; i < array.length; i++) {
    hash = hash * 37 + array[i] / 5;
}

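This is imperfect, since it groups values by integer division: 50 through 54 give the same value, but 49 and 50 do not.

If the order should not matter (so that 5 10 20 and 20 10 5 give the same hash), replace

    hash = hash * 37 + array[i] / 5;

with

    hash += array[i] / 5;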
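
For comparison, here is a sketch of both variants side by side, written in C rather than the Java-style snippet above. The function names ordered_hash and unordered_hash are made up for illustration, and unsigned long is used (an assumption, not the original's type) so that overflow on long arrays wraps instead of being undefined.

```c
#include <stddef.h>

/* Order-dependent variant: each element is coarsened by / 5 and mixed in
   with a multiplier, so swapping two elements changes the result. */
unsigned long ordered_hash(const int *v, size_t n)
{
    unsigned long hash = 13;
    for (size_t i = 0; i < n; i++)
        hash = hash * 37 + (unsigned long)(v[i] / 5);
    return hash;
}

/* Order-independent variant: the coarsened elements are simply added. */
unsigned long unordered_hash(const int *v, size_t n)
{
    unsigned long hash = 13;
    for (size_t i = 0; i < n; i++)
        hash += (unsigned long)(v[i] / 5);
    return hash;
}
```

With these definitions, 50 5 5 10 and 54 9 9 14 give the same ordered_hash, because every element falls in the same group of five. 52 7 4 12 does not match 50 5 5 10, because 4 / 5 is 0 while 5 / 5 is 1.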

EDIT: , , . , . , .

, , 5 10 20 20 10 5. , "" -, , .

- , . . "" , , . , -, -, -, .

+2

, , , .

EDIT: , , , . .

, .

, memoizing, , , .

+1

, "". , , .

- .

0

, . .

, , , . , . , :

int SqueezedSum( int a, int b, int c, int d )
{
    return (a/11) + (b/7) + (c/5) + (d/3);
}
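
To see how SqueezedSum behaves on the examples from the question, it can be called on the three tuples given there. This is a small sketch; the function is repeated so the snippet is self-contained, and print_examples is a hypothetical helper.

```c
#include <stdio.h>

/* Same function as above: each input is divided by a different constant. */
int SqueezedSum(int a, int b, int c, int d)
{
    return (a / 11) + (b / 7) + (c / 5) + (d / 3);
}

/* Prints the result for the three example tuples from the question. */
void print_examples(void)
{
    printf("%d\n", SqueezedSum(14, 4, 8, 4));   /* 1 + 0 + 1 + 1 = 3 */
    printf("%d\n", SqueezedSum(14, 4, 10, 2));  /* 1 + 0 + 2 + 0 = 3 */
    printf("%d\n", SqueezedSum(4, 4, 2, 20));   /* 0 + 0 + 0 + 6 = 6 */
}
```

The two similar tuples map to the same value and the very different one maps to another value. This holds for these particular inputs only and is not a guarantee for all similar tuples.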

, , .

0

. ""

With geometric hashing, you replace number 3 with something that is almost the opposite: close initial values give close hash values.

0

Another way to tackle my problem is to use multidimensional scaling (MDS). In MDS, we start with a matrix of items, and we want to assign each item a location in an N-dimensional space, thereby reducing the number of dimensions.

http://en.wikipedia.org/wiki/Multidimensional_scaling

0

Source: https://habr.com/ru/post/1703065/

