I think what you are looking for, and should use, is aggregateByKey.
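This method takes two parameter lists. The first takes the starting (zero) value of the accumulator. The second takes two functions: one that folds a value from your RDD into an accumulator, and one that merges two accumulators.
You can use it as follows: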
val (accZeroY, accZeroZ): (Long, Long) = (0, 0)
val mappedDataRDD = dataRDD
  // key each row by (v, w, x) and keep (y, z) as the value
  .map({
    case (v, w, x, y, z) => ((v, w, x), (y, z))
  })
  .aggregateByKey((accZeroY, accZeroZ))(
    // fold one (y, z) value into the accumulator
    { case ((accY, accZ), (y, z)) => (accY + y, accZ + z) },
    // merge two accumulators
    { case ((accY1, accZ1), (accY2, accZ2)) => (accY1 + accY2, accZ1 + accZ2) }
  )
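As you may have noticed, the type of the zero value is the type of the needed accumulation, and it does not have to match the value type of the RDD. Also note that aggregateByKey is only available on a key-value RDD (a PairRDD), which is why dataRDD is first mapped to ((v, w, x), (y, z)) pairs.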
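To make the roles of the two functions more concrete, here is a minimal sketch in plain Scala collections, without Spark, of what aggregateByKey conceptually does. The names `AggregateSketch`, `aggregateByKeyLocal` and `partitions`, and the sample rows, are hypothetical and only for illustration. Real Spark distributes the partitions across executors and does not guarantee the merge order.

```scala
// Hypothetical local model of aggregateByKey: each inner Seq stands for one partition.
object AggregateSketch {
  def aggregateByKeyLocal[K, V, U](partitions: Seq[Seq[(K, V)]])(zero: U)(
      seqOp: (U, V) => U,
      combOp: (U, U) => U
  ): Map[K, U] = {
    // Step 1: within each partition, fold every value into a per-key accumulator.
    val perPartition: Seq[Map[K, U]] = partitions.map { part =>
      part.foldLeft(Map.empty[K, U]) { case (accs, (k, v)) =>
        accs.updated(k, seqOp(accs.getOrElse(k, zero), v))
      }
    }
    // Step 2: merge the per-partition accumulators key by key.
    perPartition.foldLeft(Map.empty[K, U]) { (merged, accs) =>
      accs.foldLeft(merged) { case (m, (k, u)) =>
        m.updated(k, m.get(k).map(combOp(_, u)).getOrElse(u))
      }
    }
  }

  // Made-up rows, already mapped from (v, w, x, y, z) to ((v, w, x), (y, z)).
  val partitions: Seq[Seq[((String, String, String), (Int, Int))]] = Seq(
    Seq((("a", "b", "c"), (1, 2)), (("a", "b", "c"), (3, 4))),
    Seq((("a", "b", "c"), (5, 6)), (("d", "e", "f"), (7, 8)))
  )

  // The values are (Int, Int), but the accumulator is (Long, Long).
  val result: Map[(String, String, String), (Long, Long)] =
    aggregateByKeyLocal(partitions)((0L, 0L))(
      { case ((accY, accZ), (y, z)) => (accY + y, accZ + z) },
      { case ((accY1, accZ1), (accY2, accZ2)) => (accY1 + accY2, accZ1 + accZ2) }
    )
}
```

With these sample rows, `AggregateSketch.result` maps `("a", "b", "c")` to `(9L, 12L)` and `("d", "e", "f")` to `(7L, 8L)`. The first function is applied inside each partition, and the second one combines the partial results across partitions.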
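There is also reduceByKey, which is simpler than aggregateByKey when the accumulated type is the same as the value type, and it could be used like this: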
val mappedDataRDD = dataRDD
  .map({
    case (v, w, x, y, z) => ((v, w, x), (y, z))
  })
  .reduceByKey(
    { case ((accY, accZ), (y, z)) => (accY + y, accZ + z) }
  )
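But, in this case, you should NOT use reduceByKey. The reason I suggested aggregateByKey is that accumulating data over a large dataset can sometimes produce a result that is outside the range of your type.
For example, in your case, I suspect that your (y, z) is actually an (Int, Int), and you want to accumulate it using (v, w, x) as the key. But whenever you add up Int values in large numbers... remember that the result can end up bigger than what an Int can hold.
So... you will want the type of your accumulation to be something with a bigger range than (Int, Int), like (Long, Long), and reduceByKey does not allow you to do that. And so... I would say that you are looking for, and should use, aggregateByKey.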
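The overflow concern can be seen in a small sketch in plain Scala, without Spark. The object name `OverflowSketch` and the sample values are hypothetical. The Int reduction stands in for what reduceByKey would do with Int values, and the Long fold stands in for aggregateByKey with a Long zero value.

```scala
object OverflowSketch {
  // Two made-up Int values whose true sum does not fit in an Int.
  val values: Seq[Int] = Seq(Int.MaxValue, 1)

  // Accumulating in Int: the sum silently wraps around.
  val intSum: Int = values.reduce(_ + _)

  // Accumulating in Long: each Int is widened before it is added.
  val longSum: Long = values.foldLeft(0L)(_ + _)
}
```

Here `OverflowSketch.intSum` is `Int.MinValue` (-2147483648), while `OverflowSketch.longSum` is `2147483648L`, which is the correct total.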