Finding approximate matches in lists: hashing complex objects in Python

I am developing a Python application that basically performs the following task:

Given a list of items, say A, go through another list of items, say B, and for each item b in B find the corresponding item a in A. There should always be such an item; if there is no match, the program will fail.

Naturally, this looks like a perfect use case for binary search on a sorted structure or for lookup in a hash-based one.

The problem is that the items in the lists are complex objects. Each item a or b is itself a list of about 10 vectors (sometimes more, sometimes fewer) of this form:

vector = [ id, status, px, py, pz ]

where id and status are integer values, and px, py and pz are floating-point values, so an item may look like this:

aExample = [ [   2, -1,  0.5,  0.7,  0.9 ], 
             [  -1, -1, -0.4, -0.6, -0.8 ],
             [  25,  2,  1.1,  1.3, -1.7 ],
             [  24,  2,  1.2,  1.1,  1.6 ],
             [ -24,  2,  0.9,  0.8,  2.1 ],
             [  11,  1,  1.2,  1.3,  2.6 ],
             [ -11,  1,  1.4,  1.2,  2.4 ],
             [  13,  1,  1.8,  1.6,  2.1 ],
             [ -11,  1,  3.2,  0.1,  3.6 ] ]

List A contains several hundred thousand such items; list B contains a few tens of thousands.

To complicate matters further:

  • For a match, the two items must contain the same number of vectors, but the vectors need not be in the same order
  • For a match, all id and status values must agree
  • px, py and pz need only match approximately.
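
To make the three rules concrete, here is a minimal sketch of a direct comparison between two items. The name items_match and the absolute tolerance TOLERANCE are my own assumptions (the question does not say how close px, py and pz have to be), and the pairing is greedy, which is only safe when vectors sharing the same id and status are separated by much more than the tolerance.

```python
from collections import defaultdict

TOLERANCE = 1e-3  # assumed absolute tolerance; not specified in the question


def items_match(a, b, tol=TOLERANCE):
    """Return True if items a and b match under the three rules above."""
    # Rule 1: same number of vectors (order is ignored below).
    if len(a) != len(b):
        return False
    # Group the momenta of a by their exact integer part (id, status).
    unused = defaultdict(list)
    for vid, status, px, py, pz in a:
        unused[(vid, status)].append((px, py, pz))
    # Rules 2 and 3: pair every vector of b with an unused vector of a
    # that has the same (id, status) and nearly the same momentum.
    for vid, status, px, py, pz in b:
        candidates = unused[(vid, status)]
        for i, (qx, qy, qz) in enumerate(candidates):
            if (abs(px - qx) <= tol and abs(py - qy) <= tol
                    and abs(pz - qz) <= tol):
                del candidates[i]  # each vector of a is used at most once
                break
        else:
            return False  # no partner found for this vector of b
    return True
```

Comparing every b against every a this way is quadratic in the list sizes, which is why a hash or sort key is needed on top of it.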



:

  • Create a Vector class with the fields id, status, px, py, pz
  • Give Vector equality and hash methods, along these lines:

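    def __eq__(self, other):
        return (self.id == other.id and self.status == other.status
                and bucketOf(self.px, self.py, self.pz) == bucketOf(other.px, other.py, other.pz))

    def __hash__(self):
        # Hash exactly the fields compared in __eq__, so equal vectors hash alike
        return hash((self.id, self.status, bucketOf(self.px, self.py, self.pz)))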

    , - -, + , , bucketOf /id px, py pz,

, Vector .

, Vector, Vectors , true.

, eq, .

HashMap Java.

. , , eq.

Java, , , Python
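
In Python, the bucketing idea from the eq/hash snippet above might be sketched with plain tuples and a dict instead of a HashMap. Everything named here (bucket_of, item_key, build_index, find_match, and the bucket width of 0.01) is an assumption for illustration, not something given in the question or the answer.

```python
from collections import defaultdict

BUCKET_WIDTH = 0.01  # assumed bucket size; not given in the original


def bucket_of(px, py, pz, width=BUCKET_WIDTH):
    """Map a momentum triple to a tuple of integer bucket indices."""
    return (round(px / width), round(py / width), round(pz / width))


def item_key(item):
    """Order-independent, hashable key for a whole item (list of vectors)."""
    # Sorting makes the key independent of the order of the vectors.
    return tuple(sorted(
        (vid, status) + bucket_of(px, py, pz)
        for vid, status, px, py, pz in item
    ))


def build_index(list_a):
    """Hash every item of A once; lookups from B are then O(1) on average."""
    index = defaultdict(list)
    for pos, item in enumerate(list_a):
        index[item_key(item)].append(pos)
    return index


def find_match(index, item_b):
    """Return the positions in A whose key equals that of item_b."""
    return index.get(item_key(item_b), [])
```

One caveat with any bucketing scheme: two values that are very close can still fall on opposite sides of a bucket boundary and get different keys, so an empty result is not proof that no match exists. A fallback to a direct comparison, or probing the neighbouring buckets, would be needed for those cases.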


Source: https://habr.com/ru/post/1611087/

