Why are Ruby #hash methods randomized?

Question

I just noticed that the return value of #hash changes every time Ruby starts:

$ irb
2.0.0-p353 :001 > "".hash
2313425349783613115
2.0.0-p353 :002 > exit

$ irb
2.0.0-p353 :001 > "".hash
4543564897974813688
2.0.0-p353 :002 > exit
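
The same behaviour can be checked from a script instead of two irb sessions. This is only a rough sketch: fresh_process_hash and stable_within_process? are names made up for illustration, and it assumes the running interpreter can be re-launched through RbConfig.ruby.

```ruby
require "rbconfig"

# Ask a brand-new Ruby interpreter for the hash of the empty string.
# (fresh_process_hash is a hypothetical helper name, not part of Ruby.)
def fresh_process_hash
  IO.popen([RbConfig.ruby, "-e", 'print "".hash'], &:read).to_i
end

# Inside a single process, equal strings always hash to the same value.
def stable_within_process?
  "".hash == "".dup.hash && "abc".hash == "abc".dup.hash
end

if __FILE__ == $PROGRAM_NAME
  puts "stable within this process: #{stable_within_process?}"
  # Each child process gets its own seed, so these two values are
  # expected to differ from each other.
  puts "two fresh processes: #{fresh_process_hash} vs #{fresh_process_hash}"
end
```

So the value is not random per call; it only varies between interpreter runs.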

I looked at the MRI source to find out why this happens:

st_index_t
rb_str_hash(VALUE str)
{
    int e = ENCODING_GET(str);
    if (e && rb_enc_str_coderange(str) == ENC_CODERANGE_7BIT) {
        e = 0;
    }
    return rb_memhash((const void *)RSTRING_PTR(str), RSTRING_LEN(str)) ^ e;
}

It turns out that rb_memhash is defined in random.c:

st_index_t
rb_memhash(const void *ptr, long len)
{
    sip_uint64_t h = sip_hash24(sipseed.key, ptr, len);
#ifdef HAVE_UINT64_T
    return (st_index_t)h;
#else
    return (st_index_t)(h.u32[0] ^ h.u32[1]);
#endif
}

And although I cannot find where ruby_sip_hash24 is defined, I assume that it is not a deterministic function.

After a bit of digging, I managed to find this commit by Tanaka Akira, which changes rb_str_hash to use rb_memhash in order to "avoid algorithmic complexity attacks." What does that mean?

Thanks!

+4

ruby

ucarion Apr 28 '14 at 2:45

1 answer

xdazz · Accepted Answer · 2014-04-28T02:56:38+0000

, .

- , , , . .

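rb_memhash is seeded with a random key each time ruby starts. Without that, an attacker could predict hash values and craft many colliding keys, which makes a DoS possible.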
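
To illustrate what an algorithmic complexity attack on a hash table exploits, here is a minimal Ruby sketch. It does not attack String#hash itself; instead it forces collisions with a hypothetical Key class whose #hash can be made constant, and counts how often Hash has to fall back on eql? while inserting. The names Key and comparisons_for are made up for this example.

```ruby
# Hypothetical key type: when `colliding` is true every key reports the
# same hash value, which is what an attacker achieves by crafting inputs
# against a predictable (unseeded) hash function.
class Key
  attr_reader :id

  def initialize(id, colliding, counter)
    @id = id
    @colliding = colliding
    @counter = counter
  end

  def hash
    @colliding ? 0 : @id.hash
  end

  # Hash calls eql? to tell apart keys whose hash values are equal.
  def eql?(other)
    @counter[:eql] += 1
    other.is_a?(Key) && @id == other.id
  end
end

# Insert n distinct keys and return how many eql? calls that required.
def comparisons_for(n, colliding:)
  counter = { eql: 0 }
  table = {}
  n.times { |i| table[Key.new(i, colliding, counter)] = true }
  counter[:eql]
end

if __FILE__ == $PROGRAM_NAME
  [100, 200, 400].each do |n|
    puts "n=#{n} spread=#{comparisons_for(n, colliding: false)} " \
         "colliding=#{comparisons_for(n, colliding: true)}"
  end
end
```

With well-spread hash values almost no comparisons are needed, while with colliding keys every insert has to be checked against the keys already stored, so the total work grows roughly quadratically with the number of keys. If hash values were the same in every process, a client could precompute such colliding strings (for example as request parameter names) and send them to a server; a per-process random seed makes that precomputation impractical.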
