Spark Thrift Server does not clean up shuffle files

We execute SQL queries against a Spark EMR cluster using the Spark Thrift Server, and we see that when a SQL query (translated into a Spark job) completes, the shuffle files located under /mnt/yarn/usercache/root/appcache are not cleared. After running multiple queries, this ultimately triggers a "No space left on device" error.

If we stop the Spark Thrift Server, the shuffle files are cleared. Is it possible to have the cleanup run not only when the application stops, but also after each job? We tried setting the following parameters:

yarn.nodemanager.localizer.cache.cleanup.interval-ms=6000
yarn.nodemanager.localizer.cache.target-size-mb=1000

but the files are still not cleared. Any idea why this is happening and how we can avoid it?
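
To check which application directories are actually growing between queries, something like the following could be run on a worker node. This is only a minimal diagnostic sketch: the helper names `dir_size_bytes` and `appcache_usage` are made up for illustration, and the path is the one from the question, so it may differ on other clusters. It only reports disk usage and does not delete anything.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all files under path, skipping files that vanish mid-scan."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # Files can be removed while we walk; ignore them.
                pass
    return total


def appcache_usage(appcache_dir):
    """Return (directory name, size in bytes) pairs, largest first."""
    usage = []
    with os.scandir(appcache_dir) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                usage.append((entry.name, dir_size_bytes(entry.path)))
    return sorted(usage, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Path taken from the question; adjust for your own cluster layout.
    appcache = "/mnt/yarn/usercache/root/appcache"
    if os.path.isdir(appcache):
        for name, size in appcache_usage(appcache):
            print(f"{size / 1024 ** 2:10.1f} MiB  {name}")
```

Running it before and after a few queries should show whether the Thrift Server's own application directory is the one that keeps growing.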


Source: https://habr.com/ru/post/1689007/

