CQRS: Projecting Out-of-Order Events into an ElasticSearch Read Model

We have a microservice architecture and apply the CQRS pattern. A command sent to a microservice triggers an application state change and the emission of the corresponding event on our Kafka bus. We project these events into a read model built with ElasticSearch.

So far so good.

Our microservices are eventually consistent with each other, but at any given moment they are not (necessarily) consistent. Consequently, the events they dispatch do not always agree with each other.

In addition, to guarantee consistency between the application state change and the emission of the corresponding event, we save the new state and the corresponding event in the database in the same transaction (I know that we could use event sourcing and avoid storing state altogether). An asynchronous worker is then responsible for sending these events on the Kafka bus. This pattern ensures that each state change is dispatched as an event at least once (duplicates are not a problem, since our events are idempotent). However, since each microservice has its own event table and asynchronous worker, we cannot guarantee that events from different microservices will be sent in the order in which the corresponding state changes occurred.

EDIT: to clarify, each microservice has its own database, its own event table, and its own worker. A given worker processes events in the order in which they were stored in its event table, but different workers on different event tables, i.e. for different microservices, give no such guarantee relative to each other.
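
To make the setup concrete, here is a minimal sketch of the "state plus event in one transaction, then an asynchronous worker" pattern described above. It is an illustration only, not our production code: SQLite stands in for the real database, the `publish` callback stands in for the Kafka producer, and the table, column, and function names (`state`, `events`, `handle_command`, `relay_once`) are all hypothetical.

```python
import json
import sqlite3


def open_db():
    """Create an in-memory database with a state table and an event table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE state (id TEXT PRIMARY KEY, data TEXT)")
    db.execute(
        "CREATE TABLE events ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "sent INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def handle_command(db, aggregate_id, new_state, event):
    """Store the new state and its event atomically (both or neither)."""
    with db:  # commits on success, rolls back if an exception is raised
        db.execute(
            "INSERT OR REPLACE INTO state (id, data) VALUES (?, ?)",
            (aggregate_id, json.dumps(new_state)),
        )
        db.execute(
            "INSERT INTO events (payload) VALUES (?)",
            (json.dumps(event),),
        )


def relay_once(db, publish):
    """One pass of the worker: publish unsent events in storage order."""
    rows = db.execute(
        "SELECT seq, payload FROM events WHERE sent = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        # If the process dies after publish() but before the UPDATE below,
        # the event is published again on the next pass: at-least-once.
        publish(json.loads(payload))
        with db:
            db.execute("UPDATE events SET sent = 1 WHERE seq = ?", (seq,))
    return len(rows)
```

Within one such database the `seq` column gives a total order, which is why a single worker emits its own events in order. Nothing relates the `seq` values of two different microservices, which is where the cross-service reordering comes from.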

A problem occurs when projecting these inconsistent or out-of-order events from different microservices into the same ElasticSearch document.

A concrete example: imagine three different aggregates A, B and C (aggregates in the Domain-Driven Design sense), managed by different microservices:

  • There is a many-to-many relationship between A and B. Aggregate A references the aggregate roots of B to which it is bound, but B does not know about its relationship with A. When B is deleted, the microservice managing A listens for the corresponding event and unbinds A from B.
  • Similarly, there is a many-to-many relationship between B and C. B knows about all related C aggregates, but the converse is not true. When C is deleted, the microservice managing B listens for the corresponding event and unbinds B from C.
  • C has a "name" property.

One use case is to find, through ElasticSearch, all aggregates A that are bound to an aggregate B which is, in turn, bound to an aggregate C with a specific name.

As explained above, the separate event tables and workers can introduce variable delays between the emission of events from different microservices. Creating A, B, and C and binding them together can lead, for example, to the following sequence of events:

  • B created
  • B bound to C
  • C created with the name XYZ
  • A created
  • A bound to B

Another example of an event sequence: suppose that initially we have aggregates B and C, and two commands are issued concurrently:

  • delete C
  • bind B to C

This can lead to the following events:

  • C deleted
  • B bound to C
  • B unbound from C (in reaction to event 1)

Concretely, we are struggling to project these events into an ElasticSearch document, because events sometimes refer to aggregates that no longer exist or do not exist yet. Any help would be appreciated.

1 answer

I don't think the problem you raise is exclusive to the projection part of your system; it can also occur between microservices A, B and C.

Normally, the projector receives "C created" at about the same time as microservice B does. Only after that can B bind to C, so the specific ordering you mention should not normally be observed by the projector.

However, you are correct that messages may arrive in the wrong order if, for example, the network connection between B and C is much faster than the one between C and the projector.

I have never encountered such a problem, but several options come to my mind:

  • Do not use "foreign keys" at the level of the read model. Keep B with its reference to C, even if you know nothing about C yet. In other words, make "B bound to C" and "C created" commutative.

  • Add a causality identifier to your events. This allows the consumer to recognize and handle out-of-order messages. You can choose your own policy: reject the message, wait for the causal event to arrive, try to process it anyway, etc. However, this is not trivial to implement.

  • Messaging platforms can guarantee ordering under certain conditions. With Kafka, which you mentioned, ordering is guaranteed within the same topic and partition. RabbitMQ, I think, has even stricter prerequisites.

    I am not a messaging expert, but it seems that the inter-microservice scenarios where such ordering is possible are limited. It also seems to go against the current trend toward eventual consistency, where we tend to favor commutative operations (see CRDTs) over enforcing a total order.
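
To illustrate the first option, here is a rough sketch of a projection without "foreign keys". It is only an illustration under several assumptions: plain dictionaries stand in for ElasticSearch upserts, and the event type names and fields (`CCreated`, `BBoundToC`, `a`, `b`, `c`, ...) are invented for the example. A reference to a C that is not known yet is simply stored, and a tombstone is kept for deleted Cs so that "C created" and "C deleted" can arrive in either order.

```python
from collections import defaultdict


class Projection:
    """In-memory stand-in for the read model (hypothetical structure)."""

    def __init__(self):
        self.a_to_b = defaultdict(set)  # A id -> ids of the Bs it is bound to
        self.b_to_c = defaultdict(set)  # B id -> ids of the Cs it is bound to
        self.c_name = {}                # C id -> name, for known live Cs only
        self.deleted_c = set()          # tombstones for deleted Cs

    def apply(self, event):
        kind = event["type"]
        if kind == "CCreated":
            # Ignore a late "created" for a C that was already deleted.
            if event["c"] not in self.deleted_c:
                self.c_name[event["c"]] = event["name"]
        elif kind == "CDeleted":
            self.deleted_c.add(event["c"])
            self.c_name.pop(event["c"], None)
        elif kind == "BBoundToC":
            # No existence check: C may be unknown or already deleted.
            self.b_to_c[event["b"]].add(event["c"])
        elif kind == "BUnboundFromC":
            self.b_to_c[event["b"]].discard(event["c"])
        elif kind == "ABoundToB":
            self.a_to_b[event["a"]].add(event["b"])
        elif kind == "AUnboundFromB":
            self.a_to_b[event["a"]].discard(event["b"])
        # "ACreated" / "BCreated" need no handler in this sketch:
        # entries are created on first reference, like an upsert.

    def find_a_by_c_name(self, name):
        """All A ids bound to a B that is bound to a live C with this name."""
        c_ids = {c for c, n in self.c_name.items() if n == name}
        return sorted(
            a
            for a, b_ids in self.a_to_b.items()
            if any(self.b_to_c.get(b, set()) & c_ids for b in b_ids)
        )
```

Applied to the first sequence from the question (B created, B bound to C, C created with the name XYZ, A created, A bound to B), the query for "XYZ" finds nothing until the last event has been applied and returns A afterwards, whatever the interleaving between microservices. Note that bind and unbind events for the same pair are not made commutative here; the sketch relies on each microservice emitting its own events in order, as stated in the question's EDIT.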


Source: https://habr.com/ru/post/1273689/

