Akka router as a single point of failure

I want to create a system that will not have a single point of failure. I got the impression that routers are a tool for this, but I'm not sure if it works as I expected. This is the entry point to my program:

```scala
object Main extends App {
  val system = ActorSystem("mySys", ConfigFactory.load("application"))
  val router = system.actorOf(
    ClusterRouterPool(RoundRobinPool(0), ClusterRouterPoolSettings(
      totalInstances = 2,
      maxInstancesPerNode = 1,
      allowLocalRoutees = false,
      useRole = Some("testActor"))).props(Props[TestActor]),
    name = "testActors")
}
```

And this is the code to start the remote ActorSystem (so that the router can deploy the TestActor code to the remote nodes):

```scala
object TestActor extends App {
  val system = ActorSystem("mySys", ConfigFactory.load("application").getConfig("testactor1"))
  case object PrintRouterPath
}
```

I run this twice, once with `testactor1` and once with `testactor2`.

TestActor code:

```scala
import scala.concurrent.duration._
import TestActor.PrintRouterPath

class TestActor extends Actor with ActorLogging {
  implicit val executionContext = context.dispatcher
  // Periodically log who this actor's parent (the router) is.
  context.system.scheduler.schedule(10000.milliseconds, 30000.milliseconds, self, PrintRouterPath)

  override def receive: Receive = {
    case PrintRouterPath =>
      log.info(s"router is on path ${context.parent}")
  }
}
```

And application.conf:

```hocon
akka {
  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
  }
  remote {
    log-remote-lifecycle-events = off
    netty.tcp {
      hostname = "127.0.0.1"
      port = 2552
    }
  }
  cluster {
    seed-nodes = [
      "akka.tcp://mySys@127.0.0.1:2552",
      "akka.tcp://mySys@127.0.0.1:2553",
      "akka.tcp://mySys@127.0.0.1:2554"]
    auto-down-unreachable-after = 20s
  }
}

testactor1 {
  akka {
    actor {
      provider = "akka.cluster.ClusterActorRefProvider"
    }
    remote {
      log-remote-lifecycle-events = off
      netty.tcp {
        hostname = "127.0.0.1"
        port = 2554
      }
    }
    cluster {
      roles.1 = "testActor"
      seed-nodes = [
        "akka.tcp://mySys@127.0.0.1:2552",
        "akka.tcp://mySys@127.0.0.1:2553",
        "akka.tcp://mySys@127.0.0.1:2554"]
      auto-down-unreachable-after = 20s
    }
  }
}

testactor2 {
  akka {
    actor {
      provider = "akka.cluster.ClusterActorRefProvider"
    }
    remote {
      log-remote-lifecycle-events = off
      netty.tcp {
        hostname = "127.0.0.1"
        port = 2553
      }
    }
    cluster {
      roles.1 = "testActor"
      seed-nodes = [
        "akka.tcp://mySys@127.0.0.1:2552",
        "akka.tcp://mySys@127.0.0.1:2553",
        "akka.tcp://mySys@127.0.0.1:2554"]
      auto-down-unreachable-after = 20s
    }
  }
}
```

Now the problem is that when the process that started the router is killed, the nodes running the TestActor code stop receiving any messages (the ones sent by the scheduler). I would expect the router to be re-deployed on another seed node in the cluster, so that the routees would be restored. Is this possible? Or is there another way to implement this and avoid a single point of failure?

2 answers

I think that by deploying the router on only one node, you have created a master-slave cluster, where the master is a single point of failure by definition.

From what I understand (looking at the docs), a router can be cluster-aware in the sense that it can deploy (pool mode) or look up (group mode) routees on nodes of the cluster. The router itself will not react to failure by re-spawning elsewhere in the cluster.

I believe that you have 2 options:

  • use multiple routers to make the system more resilient. The routees can be shared (group mode) or not shared (pool mode) between the routers.

  • use the Cluster Singleton pattern, which gives you a master-slave configuration where the master is automatically re-created on another node in case of failure. In relation to your example, note that this behavior is achieved by running a dedicated actor (the ClusterSingletonManager) on every node. Its purpose is to decide whether the elected master needs to be replaced and where to start it. No such logic is in place for a cluster-aware router like the one you set up.
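To illustrate the second option, here is a minimal sketch of what the Cluster Singleton setup could look like, assuming Akka 2.4 with the akka-cluster-tools module on the classpath (the names SingletonMain and masterProxy are my own, not from the original code):

```scala
import akka.actor.{ActorSystem, PoisonPill, Props}
import akka.cluster.singleton.{
  ClusterSingletonManager, ClusterSingletonManagerSettings,
  ClusterSingletonProxy, ClusterSingletonProxySettings}

object SingletonMain extends App {
  val system = ActorSystem("mySys")

  // One manager is started on every node with the "testActor" role;
  // the cluster elects the node that hosts the actual singleton instance,
  // and re-creates it elsewhere if that node goes down.
  system.actorOf(
    ClusterSingletonManager.props(
      singletonProps = Props[TestActor],
      terminationMessage = PoisonPill,
      settings = ClusterSingletonManagerSettings(system).withRole("testActor")),
    name = "master")

  // A proxy that always routes messages to the current singleton,
  // wherever in the cluster it currently lives.
  val masterProxy = system.actorOf(
    ClusterSingletonProxy.props(
      singletonManagerPath = "/user/master",
      settings = ClusterSingletonProxySettings(system).withRole("testActor")),
    name = "masterProxy")
}
```

This trades throughput for resilience: there is only ever one active instance, but it is no longer tied to a single process.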

In this Activator example you can find samples of several cluster setups.


I checked two approaches. First, using your code with ClusterRouterPool: as you said, when the process that started the router is killed, TestActor does not receive any more messages. Reading the documentation and testing, I found that if you change this in application.conf:

 `auto-down-unreachable-after = 20s` 

to this:

 `auto-down-unreachable-after = off` 

then TestActor keeps receiving messages, although the following warning appears in the log (I don't know how to format the log here, sorry):

```
[WARN] [01/30/2017 17:20:26.017] [mySys-akka.remote.default-remote-dispatcher-5] [akka.tcp://mySys@127.0.0.1:2554/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FmySys%40127.0.0.1%3A2552-0] Association with remote system [akka.tcp://mySys@127.0.0.1:2552] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://mySys@127.0.0.1:2552]] Caused by: [Connection refused: /127.0.0.1:2552]
[INFO] [01/30/2017 17:20:29.860] [mySys-akka.actor.default-dispatcher-4] [akka.tcp://mySys@127.0.0.1:2554/remote/akka.tcp/mySys@127.0.0.1:2552/user/testActors/c1] router is on path Actor[akka.tcp://mySys@127.0.0.1:2552/user/testActors#-1120251475]
[WARN] [01/30/2017 17:20:32.016] [mySys-akka.remote.default-remote-dispatcher-5]
```

And when MainApp is restarted, everything works fine, without warnings or errors in the logs.

MainApp log:

```
[INFO] [01/30/2017 17:23:32.756] [mySys-akka.actor.default-dispatcher-2] [akka.cluster.Cluster(akka://mySys)] Cluster Node [akka.tcp://mySys@127.0.0.1:2552] - Welcome from [akka.tcp://mySys@127.0.0.1:2554]
```

TestActor log:

```
[INFO] [01/30/2017 17:23:21.958] [mySys-akka.actor.default-dispatcher-14] [akka.cluster.Cluster(akka://mySys)] Cluster Node [akka.tcp://mySys@127.0.0.1:2554] - New incarnation of existing member [Member(address = akka.tcp://mySys@127.0.0.1:2552, status = Up)] is trying to join. Existing will be removed from the cluster and then new member will be allowed to join.
[INFO] [01/30/2017 17:23:21.959] [mySys-akka.actor.default-dispatcher-14] [akka.cluster.Cluster(akka://mySys)] Cluster Node [akka.tcp://mySys@127.0.0.1:2554] - Marking unreachable node [akka.tcp://mySys@127.0.0.1:2552] as [Down]
[INFO] [01/30/2017 17:23:22.454] [mySys-akka.actor.default-dispatcher-2] [akka.cluster.Cluster(akka://mySys)] Cluster Node [akka.tcp://mySys@127.0.0.1:2554] - Leader can perform its duties again
[INFO] [01/30/2017 17:23:22.461] [mySys-akka.actor.default-dispatcher-2] [akka.cluster.Cluster(akka://mySys)] Cluster Node [akka.tcp://mySys@127.0.0.1:2554] - Leader is removing unreachable node [akka.tcp://mySys@127.0.0.1:2552]
[INFO] [01/30/2017 17:23:32.728] [mySys-akka.actor.default-dispatcher-4] [akka.cluster.Cluster(akka://mySys)] Cluster Node [akka.tcp://mySys@127.0.0.1:2554] - Node [akka.tcp://mySys@127.0.0.1:2552] is JOINING, roles []
[INFO] [01/30/2017 17:23:33.457] [mySys-akka.actor.default-dispatcher-14] [akka.cluster.Cluster(akka://mySys)] Cluster Node [akka.tcp://mySys@127.0.0.1:2554] - Leader is moving node [akka.tcp://mySys@127.0.0.1:2552] to [Up]
[INFO] [01/30/2017 17:23:37.925] [mySys-akka.actor.default-dispatcher-19] [akka.tcp://mySys@127.0.0.1:2554/remote/akka.tcp/mySys@127.0.0.1:2552/user/testActors/c1] router is on path Actor[akka.tcp://mySys@127.0.0.1:2552/user/testActors#-630150507]
```

The other approach is to use a ClusterRouterGroup, where the routees are shared among the nodes of the cluster:

  • Group - a router that sends messages to the specified path using actor selection. The routees can be shared among routers running on different nodes in the cluster. One example of a use case for this type of router is a service running on some backend nodes in the cluster and used by routers running on the frontend nodes of the cluster.

  • Pool - a router that creates routees as child actors and deploys them on remote nodes. Each router will have its own routee instances. For example, if you start a router on 3 nodes of a 10-node cluster, you will have 30 routees in total if the router is configured to use one instance per node. The routees created by the different routers will not be shared among the routers. One example of a use case for this type of router is a single master that coordinates jobs and delegates the actual work to routees running on other nodes in the cluster.
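As a side note not covered in the original answer: in Akka 2.4 a group router like the one below can also be defined via deployment configuration instead of code. A minimal sketch, assuming the same `/user/testActor` routee path and `testActor` role:

```hocon
akka.actor.deployment {
  /testActors {
    router = round-robin-group
    routees.paths = ["/user/testActor"]
    cluster {
      enabled = on
      allow-local-routees = off
      use-role = testActor
    }
  }
}
```

With this in place, `system.actorOf(FromConfig.props(), "testActors")` picks up the router settings from configuration, which keeps deployment details out of the code.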

Main application

```scala
object Main extends App {
  val system = ActorSystem("mySys", ConfigFactory.load("application.conf"))
  val routerGroup = system.actorOf(
    ClusterRouterGroup(RoundRobinGroup(Nil), ClusterRouterGroupSettings(
      totalInstances = 2,
      routeesPaths = List("/user/testActor"),
      allowLocalRoutees = false,
      useRole = Some("testActor"))).props(),
    name = "testActors")
}
```

and you need to start the TestActor on every remote node:

```scala
object TestActor extends App {
  val system = ActorSystem("mySys", ConfigFactory.load("application").getConfig("testactor1"))
  system.actorOf(Props[TestActor], "testActor")
  case object PrintRouterPath
}
```

http://doc.akka.io/docs/akka/2.4/scala/cluster-usage.html#Router_with_Group_of_Routees

The routee actors should be started as early as possible when starting the actor system, because the router will try to use them as soon as the member status is changed to "Up".

I hope this helps.


Source: https://habr.com/ru/post/1263146/
