Jaccard distance

I have this problem when calculating Jaccard distance for sets (bit vectors):

p1 = 10111;

p2 = 10011.

Intersection size = 3 (how do we find this?)

Union size = 4 (how do we find this?)

Jaccard similarity = intersection / union = 3/4.

Jaccard distance = 1 - Jaccard similarity = 1 - 3/4 = 1/4.

But I do not understand how to determine the "intersection" and "union" of the two vectors.

Please help me.

Thank you so much.

+3
2 answers

Intersection size = 3 (how do we find this?)

It is the number of set bits in p1 & p2 = 10011, which is 3.

Union size = 4 (how do we find this?)

It is the number of set bits in p1 | p2 = 10111, which is 4.

A vector here is a binary array in which the i-th bit is 1 if the i-th element is present in the set and 0 otherwise.
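The bitwise recipe above can be sketched in a few lines of Python. This is only an illustration: the function name jaccard_distance and the choice to pass the vectors as strings of 0s and 1s are assumptions made here, not something from the question. Fraction is used so the result stays exact.

```python
from fractions import Fraction


def jaccard_distance(bits_a: str, bits_b: str) -> Fraction:
    """Jaccard distance between two sets given as bit strings such as '10111'."""
    a = int(bits_a, 2)  # parse the bit string as a base-2 integer
    b = int(bits_b, 2)
    intersection = bin(a & b).count("1")  # set bits of AND = elements in both sets
    union = bin(a | b).count("1")         # set bits of OR = elements in either set
    if union == 0:
        # Assumption: two empty sets are treated as identical (distance 0).
        return Fraction(0)
    return 1 - Fraction(intersection, union)


print(jaccard_distance("10111", "10011"))  # 1/4
```

For the vectors in the question, p1 & p2 has 3 set bits and p1 | p2 has 4, so the similarity is 3/4 and the distance is 1/4.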

+6

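Given p1 = 10111 and p2 = 10011,

count the positions of p1 and p2 by type:

  • M11 = the number of positions where both p1 and p2 are 1,
  • M01 = the number of positions where p1 is 0 and p2 is 1,
  • M10 = the number of positions where p1 is 1 and p2 is 0,
  • M00 = the number of positions where both p1 and p2 are 0.

Here M11 = 3, M01 = 0, M10 = 1 and M00 = 1, so the Jaccard similarity is J = intersection / union = M11/(M01 + M10 + M11) = 3/(0 + 1 + 3) = 3/4,

and the Jaccard distance is J' = 1 - J = 1 - 3/4 = 1/4, or equivalently J' = 1 - (M11/(M01 + M10 + M11)) = (M01 + M10)/(M01 + M10 + M11) = (0 + 1)/(0 + 1 + 3) = 1/4.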
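The M11, M01, M10 and M00 counts above can also be tallied position by position. The following Python sketch is one possible way to do it; the helper name match_counts and the string representation of the vectors are assumptions made for this illustration.

```python
from collections import Counter


def match_counts(p1: str, p2: str) -> Counter:
    """Count how often each (p1 bit, p2 bit) pair occurs at the same position."""
    if len(p1) != len(p2):
        raise ValueError("bit vectors must have the same length")
    # Each position contributes a two-character key: '11', '01', '10' or '00'.
    return Counter(a + b for a, b in zip(p1, p2))


m = match_counts("10111", "10011")
m11, m01, m10, m00 = m["11"], m["01"], m["10"], m["00"]

similarity = m11 / (m01 + m10 + m11)          # M00 does not enter the formula
distance = (m01 + m10) / (m01 + m10 + m11)

print(m11, m01, m10, m00)  # 3 0 1 1
print(similarity)          # 0.75
print(distance)            # 0.25
```

Note that M00 is counted but never used: positions where both vectors are 0 belong to neither set, so they affect neither the intersection nor the union.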

+2

Source: https://habr.com/ru/post/1778990/

