I am creating a Spark application where I need to cache about 15 GB of CSV files. I read about the new UnifiedMemoryManager introduced in Spark 1.6 here:
https://0x0fff.com/spark-memory-management/
The post also includes a diagram of the executor memory layout.
The author differentiates between User Memory and Spark Memory (which is itself divided into Storage and Execution Memory). As I understand it, Spark Memory is flexible between execution (shuffling, sorting, etc.) and storage (caching): if one side needs more memory, it can borrow from the other (as long as that part has not been fully used yet). Is this assumption true?
User memory is described as follows:
User Memory. This is the memory pool that remains after the allocation of Spark Memory, and it is completely up to you to use it in a way you like. You can store your own data structures there that would be used in RDD transformations. For example, you can rewrite Spark aggregation by using the mapPartitions transformation maintaining a hash table for this aggregation to run, which would consume so-called User Memory. [...] And again, this is the User Memory and it is completely up to you what would be stored in this RAM and how, Spark makes completely no accounting on what you do there and whether you respect this boundary or not. Not respecting this boundary in your code might cause an OOM error.
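To make the quoted example concrete, here is a minimal sketch of the per-partition body one would hand to mapPartitions, written as plain Java so the User-Memory aspect is visible (the word-count aggregation and class name are my own illustration, not from the article):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch: a manual per-partition aggregation.
// The HashMap below is an ordinary on-heap data structure, so it lives in
// what the article calls "User Memory" -- Spark neither tracks nor limits
// it, and a partition with too many distinct keys can OOM the executor.
public class UserMemoryAggregation {
    public static Map<String, Integer> countPerKey(Iterator<String> rows) {
        Map<String, Integer> counts = new HashMap<String, Integer>(); // user memory
        while (rows.hasNext()) {
            String key = rows.next();
            Integer c = counts.get(key);
            counts.put(key, c == null ? 1 : c + 1);
        }
        return counts;
    }
}
```

In a real job this body would run once per partition via rdd.mapPartitions(...), with the map's entries emitted as the partition's output.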
How can I access this part of the memory, or how is it managed by Spark?
And for my purpose, do I mainly need Storage Memory (since I don't do things like shuffling, joining, etc.)? If so, should I set spark.memory.storageFraction to 1.0?
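For reference, these are the knobs involved, shown as illustrative spark-submit flags with the Spark 1.6 defaults as values (the application class and jar name are placeholders). Note that spark.memory.storageFraction only marks the part of Spark Memory that is immune to eviction by execution; it is not a hard cap, and execution can still borrow unused storage memory:

```shell
# Illustrative flags only (Spark 1.6 defaults shown).
# spark.memory.fraction: share of (heap - 300 MB) given to unified Spark Memory.
# spark.memory.storageFraction: part of Spark Memory immune to eviction by
#   execution -- not a hard cap on storage.
spark-submit \
  --conf spark.memory.fraction=0.75 \
  --conf spark.memory.storageFraction=0.5 \
  --class com.example.MyApp myapp.jar
```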
Most importantly: what is the User Memory actually for, especially given the purpose I described above?
Would memory usage differ if I changed the program slightly, e.g., by using my own classes, say an RDD&lt;MyOwnRepresentationClass&gt; instead of an RDD&lt;String&gt;?
Here is my code snippet (which I call many times from the Livy Client in a benchmark application; I am using Spark 1.6.2 with Kryo serialization):
JavaRDD<String> inputRDD = sc.textFile(inputFile);
JavaRDD<String> cachedRDD = inputRDD.filter(new Function<String, Boolean>() {
    @Override
    public Boolean call(String row) throws Exception {
        String[] parts = row.split(";");
        // The original failure check was elided; as a placeholder,
        // assume the second column carries a status flag.
        boolean hasFailure = parts.length > 1 && "FAILURE".equals(parts[1]);
        return hasFailure;
    }
}).persist(StorageLevel.MEMORY_ONLY_SER());