Understanding when to use stateful services and when to rely on external persistence in Azure Service Fabric

Question

I have been spending my evenings evaluating Azure Service Fabric as a replacement for our current WebApps / CloudServices stack, and I am a little unsure how to decide when services / actors with state should be stateful, and when they should be stateless with externally persisted state (Azure SQL, Azure Storage, and DocumentDB). I know this is a fairly new product (at least for the general public), so there are probably not many established best practices yet, but I have read most of the documentation provided by Microsoft without finding a definite answer.

The area I am currently looking at is our event store; parts of our applications are event-sourced and CQRS-based, and I am evaluating how to move this event store to the Service Fabric platform. The event store will contain a lot of time-series data, and since it is our only source of truth for the data stored there, it must be sequential, replicated, and stored in some form of durable storage.

One approach I have considered is a stateful "EventStream" actor. Each instance of an aggregate that uses event sourcing stores its events in an isolated stream. This means a stateful actor could keep track of all events for its own stream, and I would meet my requirements for how the data is stored (transactional, replicated, and durable). However, some streams can grow very large (hundreds of thousands, if not millions, of events), and this is where I start to have doubts. I imagine that an actor with a large amount of state will affect system performance when these large data models need to be serialized to or deserialized from disk.

Another option is to keep these actors stateless and have them simply read their data from some external store such as Azure SQL, or to use stateless services instead of actors.

Basically, when is the amount of state for an actor / service "too much", so that you should start considering other ways of handling the state?

Also, this section of the "Service Fabric Actors design pattern: Some anti-patterns" documentation leaves me a little puzzled:

Treat Azure Service Fabric Actors as a transactional system. Azure Service Fabric Actors is not a two-phase-commit-based system offering ACID. If we do not implement the optional persistence, and the machine the actor is running on dies, its current state will go with it. The actor will come up on another node very quickly, but unless we have implemented the backing persistence, the state will be gone. However, between leveraging retries, duplicate filtering, and / or idempotent design, you can achieve a high level of reliability and consistency.

What does "if we do not implement the optional persistence" mean here? I was under the impression that as long as your state-modifying transaction succeeded, your data was written to reliable storage and replicated, at least to a subset of the replicas. This paragraph leaves me wondering whether there are situations in which the state of my actors / services will be lost, and whether this is something I need to handle myself. The impression I got from the state model described in other parts of the documentation seems to contradict this statement.

+47

azure azure-service-fabric

Trond Nordheim May 05 '15 at 11:13

3 answers

I know this has already been answered, but I recently found myself in the same predicament with a CQRS / ES system, and this is how I went about it:

Each aggregate was an actor in which only the current state was stored.
On a command, the aggregate makes a state change and raises an event.
The events themselves are stored in DocDb.
On activation, AggregateActor instances read their events from DocDb, if any exist, to recreate the state. This obviously happens only once per actor activation. It covers the case where an actor instance is moved from one node to another.
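
The flow described above can be sketched roughly as follows. This is a minimal Python illustration of the pattern only, not Service Fabric code: InMemoryEventStore is an assumed stand-in for DocDb, and AggregateActor, on_activate, and handle_deposit are hypothetical names chosen for the example.

```python
from collections import defaultdict


class InMemoryEventStore:
    """Assumed stand-in for the external event store (DocDb in the answer)."""

    def __init__(self):
        self._streams = defaultdict(list)

    def append(self, stream_id, event):
        self._streams[stream_id].append(event)

    def read(self, stream_id):
        return list(self._streams[stream_id])


class AggregateActor:
    """Hypothetical actor that keeps only the current state in memory."""

    def __init__(self, actor_id, store):
        self.actor_id = actor_id
        self.store = store
        self.balance = 0
        self.version = 0

    def on_activate(self):
        # Runs once per activation, e.g. after the actor moves to another node.
        for event in self.store.read(self.actor_id):
            self._apply(event)

    def handle_deposit(self, amount):
        # Command handler: store the event externally, then update the state.
        event = {"type": "Deposited", "amount": amount}
        self.store.append(self.actor_id, event)
        self._apply(event)

    def _apply(self, event):
        if event["type"] == "Deposited":
            self.balance += event["amount"]
        self.version += 1


store = InMemoryEventStore()
actor = AggregateActor("account-1", store)
actor.on_activate()
actor.handle_deposit(50)
actor.handle_deposit(25)

# Simulate the actor being re-activated on another node.
moved = AggregateActor("account-1", store)
moved.on_activate()
```

The re-activated instance ends up with the same state as the original because the state is derived entirely from the externally stored events.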

+3

Raghu May 25 '16 at 5:31

To answer @Trond's secondary question, which is: what does "if we do not implement the optional persistence" mean here?

An actor is always a stateful service, and its state can be configured, using an attribute on the actor class, to operate in one of three modes:

Persisted. The state is replicated to all replica instances, and it is also written to disk. The state is maintained even if all replicas are shut down.
Volatile. The state is replicated to all replica instances, but only in memory. This means that as long as one replica instance is alive, the state is maintained. When all replicas are shut down, the state is lost and cannot be recovered after they are restarted.
None. The state is neither replicated to other replica instances nor written to disk. This provides the least state protection.
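
As a rough summary of the three modes, the sketch below encodes which failures the state is expected to survive, going only by the descriptions above. It is a simplified Python model, not the Service Fabric API: the Persistence enum, the state_survives function, and the two failure names are assumptions made for illustration.

```python
from enum import Enum


class Persistence(Enum):
    PERSISTED = "persisted"
    VOLATILE = "volatile"
    NONE = "none"


FAILURES = ("primary_lost", "all_replicas_lost")


def state_survives(mode, failure):
    """Return True if actor state is expected to survive the given failure."""
    if failure not in FAILURES:
        raise ValueError(f"unknown failure: {failure}")
    if mode is Persistence.PERSISTED:
        # Replicated and written to disk: survives both failure kinds.
        return True
    if mode is Persistence.VOLATILE:
        # In-memory replicas only: needs at least one surviving replica.
        return failure == "primary_lost"
    # NONE: no other replica and no disk copy to recover from.
    return False


for mode in Persistence:
    print(mode.name, {f: state_survives(mode, f) for f in FAILURES})
```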

A full discussion of the topic can be found in the Microsoft documentation.

0

Phillip Ngan Nov 04 '17 at 4:00

clca · Accepted Answer · 2015-05-06 22:14

One option you have is to keep "some" of the state in the actor (say, what can be considered hot data that needs to be quickly accessible) and store everything else in a "traditional" storage infrastructure such as SQL Azure, DocDB, and so on. It is difficult to give a general rule for when local state becomes too large, but it may help to think in terms of hot and cold data. Reliable Actors also let you customize the StateProvider, so you can consider implementing a custom StateProvider (by implementing IActorStateProvider) with specific policies tuned to your requirements for data volume, latency, reliability, and so on. (Note: the documentation on the StateProvider interface is still very minimal, but we can publish some sample code if this is something you want to pursue.)
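
To make the hot / cold split a little more concrete, here is a minimal Python sketch under stated assumptions. It does not use the Service Fabric SDK or IActorStateProvider: ColdStore is an assumed stand-in for an external store such as SQL Azure or DocDB, and EventStreamActor and hot_limit are hypothetical names. The actor keeps only the most recent events as local state and writes the full history to the external store.

```python
from collections import deque


class ColdStore:
    """Assumed stand-in for external storage such as SQL Azure or DocDB."""

    def __init__(self):
        self.rows = []

    def save(self, stream_id, event):
        self.rows.append((stream_id, event))

    def load(self, stream_id):
        return [event for sid, event in self.rows if sid == stream_id]


class EventStreamActor:
    """Hypothetical actor that keeps only the newest events as hot state."""

    def __init__(self, stream_id, cold_store, hot_limit=3):
        self.stream_id = stream_id
        self.cold_store = cold_store
        # A bounded deque drops the oldest entry once hot_limit is reached.
        self.hot = deque(maxlen=hot_limit)

    def append(self, event):
        # The full history always goes to the external store.
        self.cold_store.save(self.stream_id, event)
        self.hot.append(event)

    def recent(self):
        return list(self.hot)

    def full_history(self):
        return self.cold_store.load(self.stream_id)


cold = ColdStore()
actor = EventStreamActor("stream-42", cold, hot_limit=3)
for n in range(5):
    actor.append({"seq": n})
```

With this split, the size of the actor's local state stays bounded by hot_limit no matter how long the stream grows, at the cost of a slower read path for older events.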

About the anti-patterns: the note is more about transactions across multiple actors. Reliable Actors provide full guarantees on data reliability within the boundaries of a single actor. Because of the distributed and loosely coupled nature of the actor model, implementing transactions that involve multiple actors is not a trivial task. If "distributed" transactions are a strong requirement, the Reliable Services programming model is probably a better fit.
