I performed a set of performance tests for 10,000,000 items, and I found that the results in different implementations are very different.
Can someone explain why creating Range.ByOne leads to performance that is better than a simple array of primitives, but converting the same range to a list leads to even worse performance than the worst case scenario?
Create 10,000,000 elements and print those that are 1,000,000 modules. The JVM size is always set to the minimum and maximum: -Xms? M-xmx? M
import java.util.concurrent.TimeUnit import java.util.concurrent.TimeUnit._ object LightAndFastRange extends App { def chrono[A](f: => A, timeUnit: TimeUnit = MILLISECONDS): (A,Long) = { val start = System.nanoTime() val result: A = f val end = System.nanoTime() (result, timeUnit.convert(end-start, NANOSECONDS)) } def millions(): List[Int] = (0 to 10000000).filter(_ % 1000000 == 0).toList val results = chrono(millions()) results._1.foreach(x => println ("x: " + x)) println("Time: " + results._2); }
It takes 141 milliseconds with a JVM size of 27 m
By comparison, converting to a list dramatically affects performance:
import java.util.concurrent.TimeUnit import java.util.concurrent.TimeUnit._ object LargeLinkedList extends App { def chrono[A](f: => A, timeUnit: TimeUnit = MILLISECONDS): (A,Long) = { val start = System.nanoTime() val result: A = f val end = System.nanoTime() (result, timeUnit.convert(end-start, NANOSECONDS)) } val results = chrono((0 to 10000000).toList.filter(_ % 1000000 == 0)) results._1.foreach(x => println ("x: " + x)) println("Time: " + results._2) }
Requires 8514-10896 ms from 460-455 m
Unlike this Java implementation, an array of primitives is used.
import static java.util.concurrent.TimeUnit.*; public class LargePrimitiveArray { public static void main(String[] args){ long start = System.nanoTime(); int[] elements = new int[10000000]; for(int i = 0; i < 10000000; i++){ elements[i] = i; } for(int i = 0; i < 10000000; i++){ if(elements[i] % 1000000 == 0) { System.out.println("x: " + elements[i]); } } long end = System.nanoTime(); System.out.println("Time: " + MILLISECONDS.convert(end-start, NANOSECONDS)); } }
116 ms required with 59 m JVM size
Java Integer List
import java.util.List; import java.util.ArrayList; import static java.util.concurrent.TimeUnit.*; public class LargeArrayList { public static void main(String[] args){ long start = System.nanoTime(); List<Integer> elements = new ArrayList<Integer>(); for(int i = 0; i < 10000000; i++){ elements.add(i); } for(Integer x: elements){ if(x % 1000000 == 0) { System.out.println("x: " + x); } } long end = System.nanoTime(); System.out.println("Time: " + MILLISECONDS.convert(end-start, NANOSECONDS)); }
}
3993 ms required with 283 m JVM size
My question is: why is the first example so effective and the second so much affected. I tried to create views, but could not reproduce the performance benefits of the range.
All tests performed on Mac OS X Snow Leopard, 64-bit Java server 6u26 Scala 2.9.1.final
EDIT:
to conclude, here is the actual implementation using LinkedList (which is a fairer comparison in terms of space than ArrayList, since, as rightly pointed out, the scala List are linked)
import java.util.List; import java.util.LinkedList; import static java.util.concurrent.TimeUnit.*; public class LargeLinkedList { public static void main(String[] args){ LargeLinkedList test = new LargeLinkedList(); long start = System.nanoTime(); List<Integer> elements = test.createElements(); test.findElementsToPrint(elements); long end = System.nanoTime(); System.out.println("Time: " + MILLISECONDS.convert(end-start, NANOSECONDS)); } private List<Integer> createElements(){ List<Integer> elements = new LinkedList<Integer>(); for(int i = 0; i < 10000000; i++){ elements.add(i); } return elements; } private void findElementsToPrint(List<Integer> elements){ for(Integer x: elements){ if(x % 1000000 == 0) { System.out.println("x: " + x); } } } }
It takes 3621-6749 ms from 480-460 mb. This is much more consistent with the performance of the second scala example.
finally LargeArrayBuffer
import collection.mutable.ArrayBuffer import java.util.concurrent.TimeUnit import java.util.concurrent.TimeUnit._ object LargeArrayBuffer extends App { def chrono[A](f: => A, timeUnit: TimeUnit = MILLISECONDS): (A,Long) = { val start = System.nanoTime() val result: A = f val end = System.nanoTime() (result, timeUnit.convert(end-start, NANOSECONDS)) } def millions(): List[Int] = { val size = 10000000 var items = new ArrayBuffer[Int](size) (0 to size).foreach (items += _) items.filter(_ % 1000000 == 0).toList } val results = chrono(millions()) results._1.foreach(x => println ("x: " + x)) println("Time: " + results._2); }
Taking about 2145 ms and 375 mb
Thanks so much for the answers.