Scala - Iterate over two arrays

How do you iterate over two arrays of the same size, accessing the same index of each on every iteration, the Scala Way™?

 for ((aListItem, bListItem) <- (aList, bList)) {
   // do something with items
 }

The Java approach applied to Scala:

 for (i <- 0 until aList.length) {
   aList(i)
   bList(i)
 }

Assume both lists are the same size.

+6
4 answers

tl;dr: there are trade-offs between speed and convenience; you need to know your use case to choose the right approach.


If you know that both arrays are the same length and you don't need to worry about speed, the simplest and most canonical approach is to use zip inside a for-comprehension:

 for ((a,b) <- aList zip bList) { ??? } 
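
As a concrete sketch of the zip form: the object name, method name and sample values below are hypothetical and chosen only for illustration.

```scala
object ZipExample {
  // zip pairs up the elements at the same index; the pattern (name, count)
  // unpacks each pair inside the for-comprehension.
  def describe(names: Array[String], counts: Array[Int]): List[String] = {
    val out = List.newBuilder[String]
    for ((name, count) <- names zip counts) {
      out += s"$name=$count"
    }
    out.result()
  }
}
```

For example, `ZipExample.describe(Array("a", "b", "c"), Array(1, 2, 3))` walks both arrays in lockstep and collects one string per pair.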

However, the zip method creates a new array of pairs. To avoid that overhead, you can use zipped on a tuple, which presents the elements pairwise to methods such as foreach and map:

 (aList, bList).zipped.foreach{ (a,b) => ??? } 
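
On Scala 2.13 and later, zipped on a tuple is deprecated in favour of lazyZip. The sketch below assumes such a version; the object and method names are hypothetical.

```scala
object LazyZipExample {
  // Sums pairwise products without materialising an array of tuples.
  // Assumes Scala 2.13 or later, where lazyZip is available on arrays.
  def dot(a: Array[Int], b: Array[Int]): Int = {
    var sum = 0
    // Like zip, iteration stops at the end of the shorter input.
    a.lazyZip(b).foreach { (x, y) => sum += x * y }
    sum
  }
}
```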

Indexing into the arrays directly should be faster, especially if they contain primitives such as Int, since the generic code above has to box them. There is a convenient indices method that you can use:

 for (i <- aList.indices) { ??? } 
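
A minimal sketch of the indices form, filling a preallocated result array. The names are hypothetical, and the require call simply makes the equal-length assumption explicit.

```scala
object IndexedExample {
  // Element-wise sum; assumes both arrays have the same length.
  def add(a: Array[Int], b: Array[Int]): Array[Int] = {
    require(a.length == b.length, "arrays must have the same length")
    val out = new Array[Int](a.length)
    for (i <- a.indices) {
      out(i) = a(i) + b(i)
    }
    out
  }
}
```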

Finally, if you need to go as fast as you can, you can fall back to a manual while loop or recursion, for example:

 // While loop
 var i = 0
 while (i < aList.length) {
   ???
   i += 1
 }

 // Recursion
 def loop(i: Int): Unit = {
   if (i < aList.length) {
     ???
     loop(i+1)
   }
 }
 loop(0)

If you are computing a value rather than performing a side effect, recursion is sometimes faster if you pass the value along as an accumulator:

 // Recursion with explicit result
 def loop(i: Int, acc: Int = 0): Int =
   if (i < aList.length) {
     val nextAcc = ???
     loop(i+1, nextAcc)
   }
   else acc

Since you can drop a method definition in anywhere, you can use recursion without restriction. You can add the @annotation.tailrec annotation to make sure that it can be compiled down to a fast loop with jumps instead of actual recursion that eats stack space.
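
Putting the accumulator and the annotation together, a dot product might be sketched as follows. The object and method names are hypothetical, and the code assumes both arrays have the same length.

```scala
import scala.annotation.tailrec

object TailrecDot {
  // Dot product via an accumulator-passing helper; assumes a.length == b.length.
  def dot(a: Array[Int], b: Array[Int]): Int = {
    // @tailrec makes the compiler reject this helper if the recursive call
    // is not in tail position and so cannot be compiled to a loop.
    @tailrec
    def loop(i: Int, acc: Int): Int =
      if (i < a.length) loop(i + 1, acc + a(i) * b(i))
      else acc
    loop(0, 0)
  }
}
```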

Taking all these different approaches to computing a dot product on vectors of length 1024, we can compare them with a reference implementation in Java:

 public class DotProd {
   public static int dot(int[] a, int[] b) {
     int s = 0;
     for (int i = 0; i < a.length; i++) s += a[i]*b[i];
     return s;
   }
 }

plus an equivalent version where we take the dot product of the lengths of strings (so we can compare objects against primitives):

  normalized time
 -----------------
 primitive  object  method
 ---------  ------  ----------------------------------
   100%      100%   Java indexed for loop (reference)
   100%      100%   Scala while loop
   100%      100%   Scala recursion (either way)
   185%      135%   Scala for comprehension on indices
  2100%      130%   Scala zipped
  3700%      800%   Scala zip

The overhead of zip and zipped is particularly bad, of course, with primitives! (You get similarly huge jumps in time if you use an ArrayList of Integer instead of an array of int in Java.) Note in particular that zipped is a quite reasonable choice if you are storing objects.

Beware of premature optimization, though! Functional forms such as zip have advantages in clarity and safety. If you always write while loops because you think that "every little bit helps," you are probably mistaken: they take more time to write and debug, and you could spend that time optimizing a more important part of your program.


But assuming that your arrays are the same length is dangerous. Are you sure they are? How much effort will you put into making sure? Maybe you shouldn't make that assumption?

If you don't need it to be fast, just correct, then you have to choose what to do if the two arrays are not the same length.

If you want to do something with all the elements up to the length of the shorter array, then zip is still what you use:

 // The second is just shorthand for the first
 (aList zip bList).foreach{ case (a,b) => ??? }
 for ((a,b) <- (aList zip bList)) { ??? }

 // This avoids an intermediate array
 (aList, bList).zipped.foreach{ (a,b) => ??? }

If you instead want to pad the shorter array with a default value, you would use zipAll:

 aList.zipAll(bList, aDefault, bDefault).foreach{ case (a,b) => ??? }
 for ((a,b) <- aList.zipAll(bList, aDefault, bDefault)) { ??? }

In either of these cases, you can use yield with for or map instead of foreach to create a collection.
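
A small sketch of both cases producing a collection instead of a side effect. The names are hypothetical, and the default of 0 is an arbitrary choice for illustration.

```scala
object BuildCollectionExample {
  // Truncating: the result has the length of the shorter input.
  def sums(a: Array[Int], b: Array[Int]): Array[Int] =
    for ((x, y) <- a zip b) yield x + y

  // Padding: elements missing from either input are replaced by 0.
  def paddedSums(a: Array[Int], b: Array[Int]): Array[Int] =
    a.zipAll(b, 0, 0).map { case (x, y) => x + y }
}
```

Given inputs of lengths 3 and 2, sums returns two elements, while paddedSums returns three.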

If you need the index for a calculation, or it really is an array and you really need it to be fast, you will have to do the calculation manually. Padding missing elements is awkward (I leave that as an exercise for the reader), but the basic form is:

 for (i <- 0 until math.min(aList.length, bList.length)) { ??? } 

where you then use i to index into aList and bList.
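
The indexed form might be fleshed out as below, together with one possible way of handling the padding case. This is only a sketch: all names are hypothetical, and the padded variant is not taken from the original answer.

```scala
object IndexedMinExample {
  // Truncating dot product: visits only indices that are valid in both arrays.
  def dot(a: Array[Int], b: Array[Int]): Int = {
    var sum = 0
    for (i <- 0 until math.min(a.length, b.length)) {
      sum += a(i) * b(i)
    }
    sum
  }

  // Padded element-wise sum: an index past the end of one array
  // falls back to that array's default value.
  def paddedSums(a: Array[Int], b: Array[Int], aDefault: Int, bDefault: Int): Array[Int] = {
    val n = math.max(a.length, b.length)
    val out = new Array[Int](n)
    for (i <- 0 until n) {
      val x = if (i < a.length) a(i) else aDefault
      val y = if (i < b.length) b(i) else bDefault
      out(i) = x + y
    }
    out
  }
}
```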

If you really need maximum speed, you will again use (tail) recursion or while loops:

 val n = math.min(aList.length, bList.length)
 var i = 0
 while (i < n) {
   ???
   i += 1
 }

 def loop(i: Int): Unit = {
   if (i < aList.length && i < bList.length) {
     ???
     loop(i+1)
   }
 }
 loop(0)
+11

Something like:

 for ((aListItem, bListItem) <- (aList zip bList)) {
   // do something with items
 }

Or using map:

 (aList zip bList).map { case (aListItem, bListItem) =>
   // do something
 }

Updated:

To iterate without creating an intermediate collection, you can try:

 for (i <- 0 until xs.length) ... //xs(i) & ys(i) to access element 

or simply

 for (i <- xs.indices) ... 
+2
 for { i <- 0 until Math.min(list1.size, list2.size) } yield list1(i) + list2(i) 

Or something similar that checks bounds, etc.

+1

I would do something like this:

 aList.indices foreach { i =>
   val (aListItem, bListItem) = (aList(i), bList(i))
   // do something with items
 }
+1

Source: https://habr.com/ru/post/982011/

