How to use the average function in neo4j with a collection

I want to calculate the covariance of two vectors stored as collections: A = [1, 2, 3, 4], B = [5, 6, 7, 8]

Cov (A, B) = Sigma [(ai-AVGa) * (bi-AVGb)] / (n-1)
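
For the sample vectors the formula can be worked through by hand, which gives a reference value for checking any query. The notation \bar{a} and \bar{b} for the two means is introduced here for illustration and is not part of the question:

```latex
\[
\operatorname{Cov}(A, B) = \frac{1}{n-1} \sum_{i=1}^{n} (a_i - \bar{a})(b_i - \bar{b}),
\qquad \bar{a} = 2.5,\quad \bar{b} = 6.5,\quad n = 4
\]
\[
\sum_{i=1}^{4} (a_i - \bar{a})(b_i - \bar{b})
  = (-1.5)(-1.5) + (-0.5)(-0.5) + (0.5)(0.5) + (1.5)(1.5) = 5
\]
\[
\operatorname{Cov}(A, B) = \frac{5}{3} \approx 1.667
\]
```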

My problems with calculating the covariance:

1) I cannot nest aggregate functions, as when I write

SUM((ai-avg(a)) * (bi-avg(b)))

2) Or, put another way, how can I iterate over two collections in one REDUCE, for example:

REDUCE(x= 0.0, ai IN COLLECT(a) | bi IN COLLECT(b) | x + (ai-avg(a))*(bi-avg(b)))

3) If it is impossible to iterate over two collections in one REDUCE, how can I relate their values to calculate the covariance when they are reduced separately?

REDUCE(x= 0.0, ai IN COLLECT(a) | x + (ai-avg(a)))
REDUCE(y= 0.0, bi IN COLLECT(b) | y + (bi-avg(b)))

I mean, can I write a nested REDUCE?

4) Is there any way to do this with UNWIND or EXTRACT?

Thank you for your help.


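cybersam's answer works, but if you want to avoid the n^2 Cartesian product that results from the double UNWIND, you can do this instead: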

WITH [1,2,3,4] AS a, [5,6,7,8] AS b
WITH REDUCE(s = 0.0, x IN a | s + x) / SIZE(a) AS e_a,
     REDUCE(s = 0.0, x IN b | s + x) / SIZE(b) AS e_b,
     SIZE(a) AS n, a, b
RETURN REDUCE(s = 0.0, i IN RANGE(0, n - 1) | s + ((a[i] - e_a) * (b[i] - e_b))) / (n - 1) AS cov;

Edit:

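To elaborate on the Cartesian product you get from the double UNWIND in fooobar.com/questions/1621329/....: in general, UNWINDing k length-n collections in Cypher results in n^k rows. Take two collections of length 3, for example.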

> WITH [1,2,3] AS a, [4,5,6] AS b
UNWIND a AS aa
UNWIND b AS bb
RETURN aa, bb;
   | aa | bb
---+----+----
 1 |  1 |  4
 2 |  1 |  5
 3 |  1 |  6
 4 |  2 |  4
 5 |  2 |  5
 6 |  2 |  6
 7 |  3 |  4
 8 |  3 |  5
 9 |  3 |  6

Now we have n^k = 3^2 = 9 rows. Taking the averages at this point means scanning all 9 rows.

> WITH [1,2,3] AS a, [4,5,6] AS b
UNWIND a AS aa
UNWIND b AS bb
RETURN AVG(aa), AVG(bb);
   | AVG(aa) | AVG(bb)
---+---------+---------
 1 |     2.0 |     5.0

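Note that this does not change the result, because the average of a repeated list of numbers is the same as the average of the list: the average of {1,2,3} equals the average of {1,2,3,1,2,3}. For small n the extra cost is negligible, but for larger n it becomes noticeable.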
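
The row blow-up and the unchanged averages can be seen together in one query. This is a small sketch that reuses the two length-3 collections from above; the column names rows, avgA and avgB are chosen here for illustration:

```cypher
// Double UNWIND of two length-3 lists: every element of a is paired
// with every element of b, so 3 * 3 = 9 rows reach the aggregation.
WITH [1,2,3] AS a, [4,5,6] AS b
UNWIND a AS aa
UNWIND b AS bb
// COUNT(*) counts the rows; AVG runs over all 9 of them, but each value
// is repeated equally often, so the averages are still 2.0 and 5.0.
RETURN COUNT(*) AS rows, AVG(aa) AS avgA, AVG(bb) AS avgB;
```

This returns a single row with rows = 9, avgA = 2.0 and avgB = 5.0: the answer is right, but nine rows were aggregated to average three numbers.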

Now suppose we have two collections of about 1000 elements each. Using the double UNWIND to compute the averages:

> WITH RANGE(0, 1000) AS a, RANGE(1000, 2000) AS b
UNWIND a AS aa
UNWIND b AS bb
RETURN AVG(aa), AVG(bb);
   | AVG(aa) | AVG(bb)
---+---------+---------
 1 |   500.0 |  1500.0

714 ms

Now, using REDUCE:

> WITH RANGE(0, 1000) AS a, RANGE(1000, 2000) AS b
RETURN REDUCE(s = 0.0, x IN a | s + x) / SIZE(a) AS e_a,
       REDUCE(s = 0.0, x IN b | s + x) / SIZE(b) AS e_b;
   | e_a   | e_b   
---+-------+--------
 1 | 500.0 | 1500.0

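4 ms

Finally, the full covariance calculation on the same collections of about 1000 elements, first with the double UNWIND and then with REDUCE: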

> WITH RANGE(0, 1000) AS aa, RANGE(1000, 2000) AS bb
UNWIND aa AS a
UNWIND bb AS b
WITH aa, bb, SIZE(aa) AS n, AVG(a) AS avgA, AVG(b) AS avgB
RETURN REDUCE(s = 0, i IN RANGE(0,n-1)| s +((aa[i]-avgA)*(bb[i]-avgB)))/(n-1) AS covariance;
   | covariance
---+------------
 1 |    83583.5

9105 ms

> WITH RANGE(0, 1000) AS a, RANGE(1000, 2000) AS b
WITH REDUCE(s = 0.0, x IN a | s + x) / SIZE(a) AS e_a,
     REDUCE(s = 0.0, x IN b | s + x) / SIZE(b) AS e_b,
     SIZE(a) AS n, a, b
RETURN REDUCE(s = 0.0, i IN RANGE(0, n - 1) | s + ((a[i] - e_a) * (b[i] - e_b))) / (n - 1) AS cov;
   | cov    
---+---------
 1 | 83583.5

33 ms


This should calculate the covariance (according to your formula) for your sample inputs:

WITH [1,2,3,4] AS aa, [5,6,7,8] AS bb
UNWIND aa AS a
UNWIND bb AS b
WITH aa, bb, SIZE(aa) AS n, AVG(a) AS avgA, AVG(b) AS avgB
RETURN REDUCE(s = 0, i IN RANGE(0,n-1)| s +((aa[i]-avgA)*(bb[i]-avgB)))/(n-1) AS covariance;

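This approach is fine when n is small, as in the sample data.

However, as @NicoleWhite and @jjaderberg point out, when n is not small this approach becomes inefficient. @NicoleWhite's answer is the better general solution.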


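How do you arrive at the collections A and B? avg is an aggregate function and cannot be used inside REDUCE, nor can it be applied to a collection directly. You should compute the averages before that point, and how best to do so depends on how the two collections are produced. If at some stage you have individual values that you then collect into A and B, that is the place to use avg. For example: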

WITH [1, 2, 3, 4] AS aa UNWIND aa AS a
WITH collect(a) AS aa, avg(a) AS aAvg
RETURN aa, aAvg

WITH [1, 2, 3, 4] AS aColl UNWIND aColl AS a
WITH collect(a) AS aColl, avg(a) AS aAvg
WITH aColl, aAvg,[5, 6, 7, 8] AS bColl UNWIND bColl AS b
WITH aColl, aAvg, collect(b) AS bColl, avg(b) AS bAvg
RETURN aColl, aAvg, bColl, bAvg

Once you have the two averages, aAvg and bAvg, and the two collections, aColl and bColl, you can compute the covariance:

RETURN REDUCE(x = 0.0, i IN range(0, size(aColl) - 1) | x + ((aColl[i] - aAvg) * (bColl[i] - bAvg))) / (size(aColl) - 1) AS covariance
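
Assembled into a single query, the pieces above might look like the following. This is a sketch using the sample lists from the question. It assumes that collect returns the elements in their original order, which holds in practice but is an assumption here rather than a stated guarantee:

```cypher
// Step 1: unwind the first list, then aggregate it back together with its average.
WITH [1, 2, 3, 4] AS aColl UNWIND aColl AS a
WITH collect(a) AS aColl, avg(a) AS aAvg
// Step 2: the first list is back to a single row, so unwinding the second
// list yields n rows rather than n * n.
WITH aColl, aAvg, [5, 6, 7, 8] AS bColl UNWIND bColl AS b
WITH aColl, aAvg, collect(b) AS bColl, avg(b) AS bAvg
// Step 3: pair the elements by index and apply the (n - 1) formula.
RETURN REDUCE(x = 0.0, i IN range(0, size(aColl) - 1) |
         x + ((aColl[i] - aAvg) * (bColl[i] - bAvg))) / (size(aColl) - 1) AS covariance
```

Because each list is collapsed back to one row before the next UNWIND, the query never builds the Cartesian product of the two lists.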

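Thank you all. I wonder which approach is the most efficient:

1) Double UNWIND with AVG, then REDUCE over RANGE → @cybersam

2) REDUCE only → @Nicole White

3) UNWIND and collect chained with WITH (reset the query with WITH) → @jjaderberg

BUT there is an important issue:

Why do your results differ from the value I get elsewhere?

Your queries give covariance = 1.6666666666666667

The online calculator gives covariance = 1.25

Please check: https://www.easycalculation.com/statistics/covariance.php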

Vector X: [1, 2, 3, 4] Vector Y: [5, 6, 7, 8]


I think these differences arise because some calculators divide by n rather than by (n-1). When the divisor grows from n-1 to n, the result drops from about 1.67 to 1.25.
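
Both values can be produced side by side to confirm this. The following is a sketch built on the REDUCE approach from the answers above; the names sumProd, sampleCov and populationCov are chosen here for illustration:

```cypher
WITH [1,2,3,4] AS a, [5,6,7,8] AS b
// Means via REDUCE, as in the answers above.
WITH a, b, SIZE(a) AS n,
     REDUCE(s = 0.0, x IN a | s + x) / SIZE(a) AS e_a,
     REDUCE(s = 0.0, x IN b | s + x) / SIZE(b) AS e_b
// Sum of the products of the deviations, computed once.
WITH n, REDUCE(s = 0.0, i IN RANGE(0, n - 1) |
          s + ((a[i] - e_a) * (b[i] - e_b))) AS sumProd
// Same numerator, two different divisors.
RETURN sumProd / (n - 1) AS sampleCov,
       sumProd / n AS populationCov;
```

For the sample vectors the sum of products is 5.0, so sampleCov is 1.6666666666666667 and populationCov is 1.25. The only difference between the two figures is the divisor.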



Source: https://habr.com/ru/post/1621329/

