Using Scala Native to process data in memory

I am wondering whether Scala Native can be used to run large jobs in memory.

For example, imagine that you have a Spark job that needs 150 GB of RAM, so you have to run five 30 GB executors in the Spark cluster, because JVM garbage collectors cannot keep up with heaps much larger than that.

Imagine that 99% of the data being processed is Strings held in collections.

Do you think Scala Native would help here, as an alternative to Spark?

How does it handle String? Does String carry the same overhead as on the JVM, where it is treated as a class?

What are the GC heap limits, like the classic ~30 GB in the case of the JVM? Will I run into a similar limit?

Or is using Scala Native to process data in memory a bad idea? I guess scala-offheap is the better way to go.

+4
2 answers

In-memory data processing is precisely the use case where Scala Native should shine compared to Scala on the JVM.

SN supports all the usual kinds of memory allocation: static allocation (you can define a global variable in C and return a pointer to it through a C function), stack allocation, dynamic allocation based on C's malloc/free, and garbage-collected allocation (Scala's new).
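The allocation modes above can be sketched in Scala Native code. This is a minimal illustration assuming the Scala Native 0.4 `unsafe` API (`stackalloc`, `Zone`, and the C `stdlib` bindings); it requires the Scala Native toolchain and will not compile on a plain JVM Scala setup:

```scala
import scala.scalanative.unsafe._
import scala.scalanative.libc.stdlib

object Allocations {
  def main(args: Array[String]): Unit = {
    // Stack allocation: reclaimed automatically when the scope exits.
    val onStack: Ptr[CInt] = stackalloc[CInt]()
    !onStack = 42

    // Manual C-style allocation: you are responsible for calling free.
    val onCHeap: Ptr[CInt] = stdlib.malloc(sizeof[CInt]).asInstanceOf[Ptr[CInt]]
    !onCHeap = 43
    stdlib.free(onCHeap.asInstanceOf[Ptr[Byte]])

    // Zone (arena) allocation: everything allocated in the zone is
    // freed in bulk when the zone closes.
    Zone { implicit z =>
      val inZone: Ptr[CInt] = alloc[CInt]()
      !inZone = 44
    }

    // Ordinary `new` still goes through Scala Native's garbage collector.
    val gcManaged = new StringBuilder("gc-managed")
  }
}
```

None of these pointers are visible to the GC except the last object, which is why large working sets can be kept entirely off the collector's heap.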

Strings can be stored as C-style arrays of 8-bit chars rather than the 16-bit chars of a Java String, and, much as in C++, you can define flat value types with @struct.
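The difference in character-data footprint can be seen even on the JVM. This plain-Scala sketch (no Scala Native needed) compares the UTF-8 byte count, which is what a C-style `char*` would hold, against the UTF-16 code units a JVM String stores internally, two bytes each (before Java 9's compact strings):

```scala
// Compares the raw character-data size of a string in two encodings:
// UTF-8 bytes (C-style) vs UTF-16 code units (JVM String internals).
object StringOverhead {
  def utf8Size(s: String): Int  = s.getBytes("UTF-8").length
  def utf16Size(s: String): Int = s.length * 2

  def main(args: Array[String]): Unit = {
    val s = "hello world"
    println(s"UTF-8: ${utf8Size(s)} bytes; UTF-16 char data: ${utf16Size(s)} bytes")
    // Prints: UTF-8: 11 bytes; UTF-16 char data: 22 bytes
    // On top of the char data, each JVM String also pays object-header
    // and field overhead, which a raw C string avoids.
  }
}
```

For ASCII-heavy data this alone roughly halves the memory needed for string payloads, before counting per-object headers.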

Keep in mind, though, that SN is still at version 0.1, so it is far less mature than Java or Scala on the JVM.

+1

Scala is first and foremost a JVM language. Moreover, Scala Native's garbage collector (Boehm) is less sophisticated than the JVM's collectors, so Scala Native is unlikely to help here.

0

Source: https://habr.com/ru/post/1653598/

