Efficiently comparing DB row values

I want to loop over a database of documents and compute a pairwise comparison score.

The simplest, most naive method would be to nest one loop inside another. That would make the program compare every pair of documents twice, and also compare each document with itself.

Is there an algorithm for doing this efficiently? Is there a name for this approach?

Thanks.

4 answers

Suppose all items have an ItemNumber.

A simple solution: always make the second item's ItemNumber greater than the first item's.

e.g.,

for (firstitem = 1 to maxitemnumber)
  for (seconditem = firstitem+1 to maxitemnumber)
    compare(firstitem, seconditem)

This covers one triangle of the comparison matrix (x marks a compared pair) and leaves out the diagonal:

........
x.......
xx......
xxx.....
xxxx....
xxxxx...
xxxxxx..
xxxxxxx.
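
The triangle above can be checked with a short Python sketch. The function name `triangular_pairs` is made up for illustration, and it assumes items are numbered 1 to n as in the pseudocode.

```python
def triangular_pairs(max_item_number):
    """Yield (first, second) with first < second, mirroring the nested loop."""
    for first in range(1, max_item_number + 1):
        # Starting at first + 1 skips the diagonal and the mirrored half.
        for second in range(first + 1, max_item_number + 1):
            yield first, second


pairs = list(triangular_pairs(8))
# Each unordered pair appears once, so the count is n * (n - 1) / 2.
print(len(pairs), 8 * 7 // 2)
```

For 8 items this gives 28 comparisons, the same as the number of x marks in the grid, compared with 64 for the naive double loop.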

If the documents live in a SQL table, a self-join returns each pair exactly once:

SELECT a.item as a_item, b.item as b_item
FROM table AS a, table AS b
WHERE a.id<b.id
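
To try the self-join without a real database, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory table. The table name `docs` and the sample rows are assumptions made up for this example, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany(
    "INSERT INTO docs (id, item) VALUES (?, ?)",
    [(1, "alpha"), (2, "beta"), (3, "gamma")],  # hypothetical sample data
)

# a.id < b.id keeps one ordering of each pair and drops self-pairs.
rows = conn.execute(
    "SELECT a.item AS a_item, b.item AS b_item "
    "FROM docs AS a JOIN docs AS b ON a.id < b.id "
    "ORDER BY a.id, b.id"
).fetchall()
conn.close()

for a_item, b_item in rows:
    print(a_item, b_item)
```

With three rows the query returns three pairs; the comparison itself would still have to run on each returned row.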

, , - soundex - , .


Another option is to keep track of the pairs that have already been compared:

compared = set()

for i in [1,2,3]:
    for j in [1,2,3]:
        # frozenset ignores order, so (1, 2) and (2, 1) are the same pair
        pair = frozenset((i, j))
        if i != j and pair not in compared:
            compared.add(pair)
            compare(i, j)

Alternatively, start the inner loop one past the outer index, which needs no bookkeeping:

docs = [1,2,3]
n = len(docs)
for i in range(n):
    # j starts at i+1, so each pair is visited once and never (i, i)
    for j in range(i+1, n):
        compare(docs[i], docs[j])

How about something like this?

src = [1,2,3]
for i, x in enumerate(src):
    # slice from i + 1 so x is never compared with itself
    for y in src[i + 1:]:
        compare(x, y)

Or as a list comprehension:

pairs = [(x, y) for i, x in enumerate(src) for y in src[i + 1:]]
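
On the question of a name: these pairs are the 2-element combinations of the list, and Python's standard library can produce them directly with `itertools.combinations`. The sketch below is only an illustration; the `score` function and the sample documents are hypothetical stand-ins for the real comparison.

```python
from itertools import combinations


def score(a, b):
    # Hypothetical stand-in for the real comparison: counts shared words.
    return len(set(a.split()) & set(b.split()))


docs = ["red green blue", "green blue", "blue yellow"]

# combinations(..., 2) yields each unordered pair once and never (x, x).
scores = {
    (i, j): score(docs[i], docs[j])
    for i, j in combinations(range(len(docs)), 2)
}
print(scores)
```

Keying the result by index pair keeps the scores addressable later without storing the documents twice.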

Source: https://habr.com/ru/post/1736504/

