I am very new to Scala and I would like to translate my Java code with the same level of performance.
Given n float vectors and an extra vector, I have to calculate all n-point products and get the maximum.
Using Java is pretty simple for me
public static void main(String[] args) {
int N = 5000000;
int R = 200;
float[][] t = new float[N][R];
float[] u = new float[R];
Random r = new Random();
for (int i = 0;i<N;i++) {
for (int j = 0;j<R;j++) {
if (i == 0) {
u[j] = r.nextFloat();
}
t[i][j] = r.nextFloat();
}
}
long ts = System.currentTimeMillis();
float maxScore = -1.0f;
for (int i = 0;i < N;i++) {
float score = 0.0f;
for (int j = 0; i < R;i++) {
score += u[j] * t[i][j];
}
if (score > maxScore) {
maxScore = score;
}
}
System.out.println(System.currentTimeMillis() - ts);
System.out.println(maxScore);
}
The calculation time is 6 ms on my machine.
Now I have to do it with Scala
val t = Array.ofDim[Float](N,R)
val u = Array.ofDim[Float](R)
// Filling with random floats like in Java
val ts = System.currentTimeMillis()
var maxScore: Float = -1.0f
for ( i <- 0 until N) {
var score = 0.0f
for (j <- 0 until R) {
score += u(j) * t(i)(j)
}
if (score > maxScore) {
maxScore = score
}
}
println(System.currentTimeMillis() - ts)
println(maxScore);
The above code takes more than a second on my machine. My thought is that Scala does not have a primitive array structure, such as float [] in Java, and is replaced with a collection. Index access seems to be slower than the one that has a primitive array in Java.
The following code is even slower:
val maxScore = t.map( r => r zip u map Function.tupled(_*_) reduceLeft (_+_)).max
which takes 26s
How can I efficiently iterate over my 2 arrays to calculate this?
Thank you so much