The document you are linking does not reflect what actually shipped in Tachyon as an open source project: parts of that article only ever existed as research prototypes and were never fully integrated into Spark / Tachyon.
When you persist data with OFF_HEAP storage via rdd.persist(StorageLevel.OFF_HEAP), Spark uses Tachyon to write the data into Tachyon's memory space as a file. This removes it from the Java heap, giving Spark more heap memory to work with.
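A minimal sketch of what that looks like in a Spark application. The Tachyon URL and the configuration property name are illustrative assumptions here (the property varied across Spark 1.x versions), and this requires a running Spark cluster with a Tachyon master to actually execute:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object OffHeapPersistExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("OffHeapPersistExample")
      // Assumed endpoint for the Tachyon master; adjust to your deployment.
      // The exact property name depends on your Spark version.
      .set("spark.externalBlockStore.url", "tachyon://localhost:19998")
    val sc = new SparkContext(conf)

    val rdd = sc.parallelize(1 to 1000000)

    // OFF_HEAP stores the serialized partitions in Tachyon's memory space
    // as files, instead of on the JVM heap.
    rdd.persist(StorageLevel.OFF_HEAP)

    println(rdd.count())
    sc.stop()
  }
}
```

Because the blocks live outside the JVM, garbage collection pressure on the Spark executors goes down, but the data's survival now depends on Tachyon rather than on Spark's own block manager.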
Tachyon currently does not persist lineage information, so if your data is too large to fit into the memory configured for the Tachyon cluster, parts of the RDD will be lost and your Spark jobs may fail.