How to sum each column of a Scala array?

If I have an array of arrays (similar to a matrix) in Scala, what is an efficient way to sum each column of the matrix? For example, if my array of arrays looks like this:

val arr =  Array(Array(1, 100, ...), Array(2, 200, ...), Array(3, 300, ...))

and I want to sum each column (for example, sum the first elements of all subarrays, sum the second elements of all subarrays, etc.) and get a new array, as shown below:

newArr = Array(6, 600, ...)

How can I do this efficiently in Spark Scala?

+4
4 answers

Using breeze Vector:

scala> val arr =  Array(Array(1, 100), Array(2, 200), Array(3, 300))
arr: Array[Array[Int]] = Array(Array(1, 100), Array(2, 200), Array(3, 300))

scala> arr.map(breeze.linalg.Vector(_)).reduce(_ + _)
res0: breeze.linalg.Vector[Int] = DenseVector(6, 600)

If your input is sparse, you may consider using breeze.linalg.SparseVector.

+4

List has a .transpose method that can help here:

arr.toList.transpose.map(_.sum)

(Then call .toArray if you specifically need the result as an array.)
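As a minimal sketch of the transpose approach wrapped in a function: the helper name colSumsViaTranspose is hypothetical, and the code assumes every row has the same length, since transpose is not meant for ragged input. The inner rows are converted to List as well, which is a conservative choice rather than a requirement.

```scala
// Hypothetical helper; assumes all rows have the same length.
def colSumsViaTranspose(rows: Array[Array[Int]]): Array[Int] =
  rows.toList.map(_.toList) // List[List[Int]], one inner list per row
    .transpose              // one inner list per column
    .map(_.sum)             // sum each column
    .toArray                // back to an Array, as the question asks

val arr = Array(Array(1, 100), Array(2, 200), Array(3, 300))
val sums = colSumsViaTranspose(arr)
```

This builds intermediate lists, so it favors brevity over raw speed.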

+5

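A vector library, as mentioned by @zero323, is one option.

Without a vector library, you can write a function col2sum that adds two rows element-wise, even if they are not the same length, and then use Array.reduce to extend this operation to N rows. Using reduce is valid because a sum does not depend on the order of operations (i.e. 1 + 2 + 3 == 3 + 2 + 1 == 3 + 1 + 2 == 6):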

def col2sum(x:Array[Int],y:Array[Int]):Array[Int] = {
    x.zipAll(y,0,0).map(pair=>pair._1+pair._2)
}

def colsum(a:Array[Array[Int]]):Array[Int] = {
    a.reduce(col2sum)
}

val z = Array(Array(1, 2, 3, 4, 5), Array(2, 4, 6, 8, 10), Array(1, 9));

colsum(z)

--> Array[Int] = Array(4, 15, 9, 12, 15)
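If the intermediate arrays created by zipAll and map are a concern, one possible alternative is a single mutable accumulator. The sketch below is an assumption about what "efficient" could mean here, not part of the answer above; the name colSumsInPlace is hypothetical. Like col2sum, it treats missing elements in shorter rows as 0, and it returns an empty array for empty input, whereas reduce throws on an empty array.

```scala
// Hypothetical helper: sums columns with one accumulator array.
def colSumsInPlace(rows: Array[Array[Int]]): Array[Int] = {
  // The widest row determines the result length; shorter rows count as 0.
  val width = if (rows.isEmpty) 0 else rows.map(_.length).max
  val sums = new Array[Int](width) // initialized to zeros
  for (row <- rows) {
    var i = 0
    while (i < row.length) {
      sums(i) += row(i)
      i += 1
    }
  }
  sums
}

val z = Array(Array(1, 2, 3, 4, 5), Array(2, 4, 6, 8, 10), Array(1, 9))
val total = colSumsInPlace(z)
```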
+4

scala> val arr =  Array(Array(1, 100), Array(2, 200), Array(3, 300 ))
arr: Array[Array[Int]] = Array(Array(1, 100), Array(2, 200), Array(3, 300))

scala> arr.flatten.zipWithIndex.groupBy(c => (c._2 + 1) % 2)
       .map(a => a._1 -> a._2.foldLeft(0)((sum, i) => sum + i._1))

res40: scala.collection.immutable.Map[Int,Int] = Map(1 -> 6, 0 -> 600)

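Use zipWithIndex, groupBy, and then foldLeft. The modulus must equal the number of columns (2 here); since the key is (index + 1) % 2, key 1 is the first column and key 0 is the second.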
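The same idea can be sketched without hard-coding the column count by indexing each row separately, so the group key is the column index itself. This is a variation under that assumption, not the answer's exact code, and the name colSumsByIndex is hypothetical.

```scala
// Hypothetical helper: zipWithIndex + groupBy + foldLeft, keyed by column index.
def colSumsByIndex(rows: Array[Array[Int]]): Array[Int] = {
  // Pair every value with its column index inside its own row.
  val indexed = rows.flatMap(_.zipWithIndex)
  // Group the (value, columnIndex) pairs by column index.
  val byColumn = indexed.groupBy(_._2)
  val width = if (byColumn.isEmpty) 0 else byColumn.keys.max + 1
  // Emit the sums in column order instead of as an unordered Map.
  Array.tabulate(width)(c => byColumn(c).foldLeft(0)((sum, p) => sum + p._1))
}

val arr = Array(Array(1, 100), Array(2, 200), Array(3, 300))
val sums = colSumsByIndex(arr)
```

Returning an array ordered by column avoids having to interpret Map keys afterwards.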

0

Source: https://habr.com/ru/post/1609737/

