If I have a dict, for example { key1 : value1, key2 : value2,..., key17:value17 }
, and 2 keys give the same hash, say key13 and key5 both give 12 when hashing, since I understand that python implements a conflict resolution method (open addressing if I'm not mistaken), to solve this problem. So, let's say value5 will be stored in index 12, and value 13 will be stored in another public index, determined by the conflict resolution method.
Here the hard part that I get confused: To get the value (for example, from key5), does the CPython interpreter hash the key and extract the value from the HASHVALUE index? This cannot be right, because then how does the interpreter know if value 13 belongs to key5, or is it in a different index due to a collision?
I tried to look at the C code from https://github.com/python/cpython/blob/master/Objects/dictobject.c#L1041
and the function seems
PyObject *
PyDict_GetItem(PyObject *op, PyObject *key)
{
Py_hash_t hash;
PyDictObject *mp = (PyDictObject *)op;
PyDictKeyEntry *ep;
PyThreadState *tstate;
PyObject **value_addr;
if (!PyDict_Check(op))
return NULL;
if (!PyUnicode_CheckExact(key) ||
(hash = ((PyASCIIObject *) key)->hash) == -1)
{
hash = PyObject_Hash(key);
if (hash == -1) {
PyErr_Clear();
return NULL;
}
}
tstate = (PyThreadState*)_Py_atomic_load_relaxed(
&_PyThreadState_Current);
if (tstate != NULL && tstate->curexc_type != NULL) {
PyObject *err_type, *err_value, *err_tb;
PyErr_Fetch(&err_type, &err_value, &err_tb);
ep = (mp->ma_keys->dk_lookup)(mp, key, hash, &value_addr);
PyErr_Restore(err_type, err_value, err_tb);
if (ep == NULL)
return NULL;
}
else {
ep = (mp->ma_keys->dk_lookup)(mp, key, hash, &value_addr);
if (ep == NULL) {
PyErr_Clear();
return NULL;
}
}
return *value_addr;
}
but my knowledge of C is very small, and I frankly do not understand what half of this says.