The matrix U returned by X.computeSVD has dimensions m × k, where m is the number of rows of the original (distributed) RowMatrix X. One would expect m to be large (most likely much larger than k), so collecting U in the driver is not practical if we want our code to scale to really large values of m.
I would say that both of the solutions below suffer from this drawback. The answer by @Alexander Kharlamov calls val U = svd.U.toBlockMatrix().toLocalMatrix(), which collects the matrix in the driver. The same thing happens in the answer by @Climbs_lika_Spyder (by the way, your nickname rocks!), which calls svd.U.rows.collect.flatMap(x => x.toArray). I would rather rely on distributed matrix multiplication, for example the Scala code posted here.
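To make the idea concrete, here is a minimal sketch of keeping U distributed. It assumes the goal is a product such as U · diag(s); that goal and the helper name scaledU are my own illustration, not the linked code. It relies only on RowMatrix.computeSVD, Matrices.diag and RowMatrix.multiply from spark.mllib.

```scala
import org.apache.spark.mllib.linalg.{Matrices, Matrix, SingularValueDecomposition}
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Hypothetical helper (not from the original answers): returns U * diag(s)
// as a distributed RowMatrix, without ever collecting U in the driver.
def scaledU(X: RowMatrix, k: Int): RowMatrix = {
  // computeU = true asks Spark to materialise U as a distributed RowMatrix (m x k).
  val svd: SingularValueDecomposition[RowMatrix, Matrix] =
    X.computeSVD(k, computeU = true)

  // diag(s) is only k x k, so holding it as a local matrix is cheap.
  val sigma: Matrix = Matrices.diag(svd.s)

  // RowMatrix.multiply takes a local matrix on the right-hand side and
  // returns another RowMatrix, so the m rows stay on the executors.
  svd.U.multiply(sigma)
}
```

The only objects that reach the driver here are s (length k) and V (n × k), which is fine as long as n and k are small enough to fit in local memory. The result can be processed further through its rows RDD instead of being collected.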