Sharing the same Spark executors (or their cache) between several applications

We have a web application that connects to a Spark cluster to run some calculations. It also caches a large amount of data in the Spark executor cache.

To meet high-availability requirements, we need to run two instances of our web application on different hosts. Doing this naively means that the second instance launches a separate set of executors, which initialize their own huge cache, completely identical to the first application's.
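To illustrate the duplication, here is a minimal sketch of how the two instances would be submitted against a standalone cluster (the master URL, resource sizes, and the `webapp.py` entry point are assumptions, not from the original setup). Each `spark-submit` creates a separate Spark application, so Spark allocates each one its own executors and each builds its own copy of the cache:

```shell
# Instance 1 on host-a: gets its own executors and builds its own cache
spark-submit \
  --master spark://spark-master:7077 \
  --executor-memory 8g \
  --total-executor-cores 16 \
  webapp.py

# Instance 2 on host-b: a separate Spark application, so the cluster
# allocates a second, independent set of executors that re-cache
# the same data
spark-submit \
  --master spark://spark-master:7077 \
  --executor-memory 8g \
  --total-executor-cores 16 \
  webapp.py
```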

Ideally, we would like the cached data in Spark to be reused by all instances of our application.

I am aware of the option of using Tachyon to externalize the executor cache. Other options are still being investigated.

Is there a way to allow multiple instances of the same application to connect to the same set of Spark executors?


Source: https://habr.com/ru/post/1570471/
