Does python-memcached support consistent hashing and the binary protocol?

Python-memcached is the officially supported memcached driver for Django.

Does it support

  • Consistent Hashing
  • Binary protocol

If so, how can I use these features in Django? I could not find any documentation.

+4
6 answers

Looking at the _get_server method in python-memcached v1.45, it does not use consistent hashing, just hash % len(buckets) .
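To see why plain modulo placement is a problem, here is a small standalone sketch (not python-memcached code; the helper name is mine) that counts how many keys change buckets when a fourth server is added to a 3-server pool:

```python
import zlib

def bucket_for(key, num_buckets):
    # Plain modulo placement, in the spirit of hash % len(buckets).
    return zlib.crc32(key.encode()) % num_buckets

keys = ["user:%d" % i for i in range(10000)]
moved = sum(1 for k in keys if bucket_for(k, 3) != bucket_for(k, 4))
# Most keys land on a different server after the resize; consistent
# hashing would remap only about 1/4 of them.
print("%.0f%% of keys changed buckets" % (100.0 * moved / len(keys)))
```

With hash % n, around three quarters of the keys move when going from 3 to 4 buckets, which is exactly the cold-cache problem consistent hashing is meant to avoid.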

The same goes for the binary protocol: as far as I can see in the source, python-memcached uses only the text commands.
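For context on what "text commands" means here: python-memcached speaks memcached's ASCII protocol, where every request is a readable line, while the binary protocol packs requests into fixed 24-byte headers. A minimal sketch of the ASCII framing for a set (the helper is illustrative, not library code):

```python
def format_set(key, value, flags=0, exptime=0):
    # Build a memcached ASCII-protocol "set" command, per the protocol
    # spec: set <key> <flags> <exptime> <bytes>\r\n<data block>\r\n
    data = value.encode()
    header = "set %s %d %d %d\r\n" % (key, flags, exptime, len(data))
    return header.encode() + data + b"\r\n"

print(format_set("greeting", "hello"))
# b'set greeting 0 0 5\r\nhello\r\n'
```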

+2

You might be able to use this: http://amix.dk/blog/post/19370

It wraps the python-memcache client class so that keys are distributed using consistent hashing.

EDIT: digging into the source code of python-memcached 1.4.5, it looks like it may actually support consistent hashing. Relevant code:

    from binascii import crc32   # zlib version is not cross-platform

    def cmemcache_hash(key):
        return ((((crc32(key) & 0xffffffff) >> 16) & 0x7fff) or 1)
    serverHashFunction = cmemcache_hash

    # -- SNIP --

    def _get_server(self, key):
        if isinstance(key, tuple):
            serverhash, key = key
        else:
            serverhash = serverHashFunction(key)

        for i in range(Client._SERVER_RETRIES):
            server = self.buckets[serverhash % len(self.buckets)]
            if server.connect():
                # print "(using server %s)" % server,
                return server, key
            serverhash = serverHashFunction(str(serverhash) + str(i))
        return None, None

Based on this code, it looks like it implements that algorithm, if cmemcache_hash is a meaningful name referring to a real algorithm. (The now-abandoned cmemcache library does do consistent hashing.)

But I think the OP is referring to more "robust" consistent hashing, e.g. libketama. I don't think there is an off-the-shelf solution here; it looks like you need to roll up your sleeves, compile/install a more advanced memcached library such as pylibmc, and write a custom Django backend that uses it instead of python-memcached.
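As a sketch of that route, pylibmc (a libmemcached binding) exposes both features the OP asked about; this assumes a reachable memcached cluster, so it is illustrative rather than runnable as-is:

```python
import pylibmc

# "binary" and the "ketama" behavior are documented pylibmc options;
# this needs the pylibmc package and live memcached servers to run.
mc = pylibmc.Client(
    ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"],
    binary=True,                  # memcached binary protocol
    behaviors={"ketama": True},   # libketama-style consistent hashing
)
mc.set("greeting", "hello")
```

A custom Django cache backend would then wrap this client (third-party packages such as django-pylibmc already do this).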

In any case, some remapping of keys will happen when you add/remove buckets from the pool (even with libketama, though less than with other algorithms).

+1

Please check this sample Python implementation of consistent hashing, built on a continuum circle.

Main implementation principle: imagine a continuum circle with many replicated server points placed along it. When we add a new server, roughly 1/n of the total cache keys are remapped (and those cached entries are lost).

    '''consistent_hashing.py is a simple demonstration of consistent hashing.'''
    import bisect
    import hashlib

    class ConsistentHash:
        '''Imagine a continuum circle with a number of replicated server
        points spread across it; when we add a new server, roughly 1/n
        of the total cache keys are remapped.

        hash_tuples is a list of tuples (j, k, hash), where j ranges over
        server indices, k over replica numbers, and hash is the
        corresponding hash value, in the range [0, 1).  The tuples are
        sorted by increasing hash value.  get_machine(key) returns the
        index of the server to which key should be mapped.'''

        def __init__(self, replicas=1):
            self.num_replicas = replicas

        def setup_servers(self, servers=None):
            hash_tuples = [(index, k, my_hash(str(index) + "_" + str(k)))
                           for index, server in enumerate(servers)
                           for k in range(int(self.num_replicas) * int(server.weight))]
            self.hash_tuples = self.sort(hash_tuples)

        def sort(self, hash_tuples):
            '''Sort the hash tuples based on just the hash values.'''
            hash_tuples.sort(key=lambda t: t[2])
            return hash_tuples

        def add_machine(self, server, siz):
            '''Add a new machine, then update the server hashes in the
            continuum circle.'''
            new_points = [(siz, k, my_hash(str(siz) + "_" + str(k)))
                          for k in range(self.num_replicas * server.weight)]
            self.hash_tuples.extend(new_points)
            self.hash_tuples = self.sort(self.hash_tuples)

        def get_machine(self, key):
            '''Return the index of the server which key gets sent to.'''
            h = my_hash(key)
            # Edge case where we cycle past hash value of 1 and back to 0.
            if h > self.hash_tuples[-1][2]:
                return self.hash_tuples[0][0]
            hash_values = [t[2] for t in self.hash_tuples]
            index = bisect.bisect_left(hash_values, h)
            return self.hash_tuples[index][0]

    def my_hash(key):
        '''my_hash(key) returns a hash in the range [0, 1).'''
        return (int(hashlib.md5(key.encode()).hexdigest(), 16) % 1000000) / 1000000.0
+1

vBuckets are now another good option for getting consistent-hashing-like behavior with minimal impact on cache misses.

0

If you want a plug-and-play solution for Django, use django-memcached-hashring: https://github.com/jezdez/django-memcached-hashring .

It is an adapter around django.core.cache.backends.memcached.MemcachedCache and the hash_ring library.

0

I used a consistent hashing algorithm. The keys lost when adding a server are 1/n of the total. For example, with 7 servers, successful key retrieval is 6/7 * 100, about 85%.
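The arithmetic behind that figure, assuming one server is added to a 6-node pool:

```python
# With consistent hashing, adding one server to an n-node pool remaps
# about 1/(n+1) of the keys; the rest still resolve to their old server.
old_nodes = 6
new_nodes = old_nodes + 1
hit_rate = old_nodes / new_nodes  # fraction of keys still found
print("%.1f%%" % (100 * hit_rate))
```

This prints 85.7%, matching the roughly 85% quoted above.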

0

Source: https://habr.com/ru/post/1306600/
