EventStore - partial ordering of events and other features

I am trying to evaluate EventStore as a reliable queuing mechanism internal to server software.

MSMQ is not a suitable alternative because it does not support partial ordering, that is, ordered messages within “conversations” (message threads). Its 4 MB message size limit is also a problem (one that partial ordering would let me work around). SQL Service Broker does support partial ordering, but it is a pain in the butt to set up and manage.

Since the documentation for EventStore is admittedly sparse, can anyone with EventStore experience help with the following?

  • Does EventStore support transactional event handling, that is, if processing fails, can dequeue be rolled back?
  • With multiple readers in different threads, processes, or machines, does EventStore ensure that each event is dispatched (?) to only one reader (at a time, possibly for the duration of a transaction)?
  • Assuming that is possible, can events in different “conversations” be read concurrently in any order, while events in the same conversation are read one at a time and in order?
  • I read that EventStore basically provides “at least once” delivery. Is it possible, using specific storage providers, to get “exactly once” delivery?
  • How are poison events handled, that is, events that cause an error during processing? The error may be transient, in which case the event can be retried. Or it may be permanent and require administrative intervention.
  • Is it possible to manually manipulate EventStore storage if necessary? Can this be done while other readers keep reading?

(I have read that transactions in the storage engine are not required, but I still use transaction language to refer to whatever replaces transactions at the EventStore level. If there are important functional consequences of moving from transactions to something else, please comment on them. I don't need to understand every aspect right away; I just hope to learn enough to experiment further.)

2 answers

Although the EventStore could potentially be used to build a full-blown queue, it was never designed with that in mind. This means that many opinionated design decisions went into the library that are at odds with the requirements in your question.

For example, “exactly once” delivery is something that messaging systems do not really support. Other things you mention, such as poison messages, are not really an issue because the EventStore is not hooked into a message pipeline in that way.

The problem you are trying to solve does not seem to be one that the EventStore can help you with. I would therefore recommend evaluating a full-blown message queue such as RabbitMQ.

Also, what do your messages contain that makes them larger than 4 MB? If you are pushing files or large binary streams, why not put them in some highly available “global” storage (such as Amazon S3) and then pass a pointer to the stored object in the message?
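
As a rough illustration of that last suggestion, here is a minimal Python sketch that keeps the large payload out of the queue and sends only a small pointer. Everything in it is an assumption made for illustration: `InMemoryBlobStore` is a hypothetical stand-in for real shared storage such as Amazon S3, and `make_message` / `read_payload` are invented helper names, not part of any library.

```python
import hashlib
import json


class InMemoryBlobStore:
    """Hypothetical stand-in for a shared blob store (e.g. Amazon S3)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Use the content hash as the key, so storing the same payload twice is harmless.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def make_message(store: InMemoryBlobStore, conversation_id: str, payload: bytes) -> str:
    """Store the payload out of band and return a small JSON message that points to it."""
    key = store.put(payload)
    return json.dumps(
        {"conversation": conversation_id, "blob_key": key, "size": len(payload)}
    )


def read_payload(store: InMemoryBlobStore, message: str) -> bytes:
    """Consumer side: follow the pointer in the message back to the payload."""
    envelope = json.loads(message)
    return store.get(envelope["blob_key"])
```

With this shape the queued message stays at a few hundred bytes regardless of payload size, so a 4 MB limit on the queue no longer constrains the data itself.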


Some thoughts, although you seem to be satisfied with the first answer:

  • Partial ordering is what you get if you track the causality of messages. There are several ways to do this. A naive way would be to keep a list of all the nodes in the distributed network that the message has passed through, and to append to that list whenever you send the message on. I say "the message", but I really mean the messages in a conversation.

    This may work well for simple systems, but as your systems become more complex, you may want a saga to keep track of what the messages mean.

    However, you may have a requirement for partial ordering on a single receiving node, and then sagas in a hub-and-spoke pattern of message flow will not help you. In that case you may really need to add some logic to the transport you are using.

    One such algorithm is called a vector clock; another, similar one is called a version vector. Here's an example implementation of a vector clock in Go. If you want to spend a couple of hours on it, I'm working on a tiny vector clock lib for F#, because the algorithm is really, really simple. If you want to read something that really makes sense of this, I recommend this book: Elements of Distributed Systems. Chapters 2-3 and 5 are good.

    That gives you a partial order of your conversations in a distributed system. The reason you cannot get this from the queue alone is that there is a network between your queue and your node: if either the node, or the process on the same node as the queue, goes down while it has a message in transit, that message will be requeued and reordered. The same happens if you reject (nack) the message. You can get around this reordering problem by using 2PC from the queue to the consuming client, or you can sort the messages by their vector clock, or by a sequence ID assigned by the publishing/sending application, or by some piece of data that makes semantic sense from your consumer's point of view. The choice is yours.

  • As for the other requirements, such as poison messages, you should look at what a service bus gives you. I personally use MassTransit, and it handles poison messages well. These are some failure modes when consuming messages:

    • Serialization error - you made a programming error. Fix your development process, because this should not happen. If such errors still occur, just move the messages to the poison queue - they are probably corrupt.
    • Unhandled exception from your code - you made a programming error. Again, your dev process should catch this. Failing that, let the exception propagate so that your service bus moves the message to the poison queue at runtime.
    • You may have problems writing to the consumer's local database - this is an operations problem and should not be handled in code. Kill your process, because right now there is nothing you can do from code. Nagios or some other monitor should tell your operations people that something is broken and needs to be fixed urgently, because if you cannot write to your read model, the read model probably cannot serve the queries made against it either. Puppet or some other process monitor may restart your process after a while, and then you can go through the same steps again, assuming that everything is in order, but this time do not start consuming from the queue until you have a connection to the database (this is what NHibernate does with its static initialization at startup, for example) - and implement a retry policy, such as a circuit breaker on top of this retry logic.
  • Large events - make sure your queue API accepts byte arrays that long. ZeroMQ has multi-part messages. AMQP / RabbitMQ does not, so you have to chunk them yourself, which forces you to put them back in order. Or you can just pass a handle to a binary blob stored somewhere the consumer can read it, as everyone else does.
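
To make the vector clock from the first point a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: a clock is represented as a plain dict from node name to counter, and the function names (`tick`, `merge`, `happened_before`, `concurrent`) are made up for this example rather than taken from any library mentioned above.

```python
from typing import Dict

Clock = Dict[str, int]  # node name -> number of events seen from that node


def tick(clock: Clock, node: str) -> Clock:
    """Return a copy of `clock` with `node`'s own counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(local: Clock, received: Clock, node: str) -> Clock:
    """On receipt: take the element-wise maximum, then tick the receiving node."""
    merged = {
        n: max(local.get(n, 0), received.get(n, 0))
        for n in set(local) | set(received)
    }
    return tick(merged, node)


def happened_before(a: Clock, b: Clock) -> bool:
    """True if `a` causally precedes `b`: a <= b everywhere and a < b somewhere."""
    nodes = set(a) | set(b)
    no_greater = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    strictly_less = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    return no_greater and strictly_less


def concurrent(a: Clock, b: Clock) -> bool:
    """True if neither clock precedes the other, so the order is undefined."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    return a_ahead and b_ahead


# Node A sends a message, node B receives it, node C acts independently.
sent_by_a = tick({}, "A")
seen_by_b = merge({}, sent_by_a, "B")
event_on_c = tick({}, "C")
```

Here `happened_before(sent_by_a, seen_by_b)` holds, while `seen_by_b` and `event_on_c` are concurrent. That is exactly the partial order described above: messages in one conversation can be put in order, and unrelated conversations can be processed in any order.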


Source: https://habr.com/ru/post/1382283/

