Saving a snapshot of the latest version of each aggregate in the event store

We currently use a SQL-based event store (a typical two-table implementation). Although we only use the event store for writing, some people on the team fear that it may become slow, so instead of adding snapshots here and there it was suggested to maintain a fully consistent (with the event streams) snapshot of each aggregate in its latest state (in JSON format). All queries in the system are ultimately executed on the read side against a typical SQL database, which is in turn updated from the ES (write) side.
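For context, the "typical two-table implementation" plus a latest-state snapshot table might look like the following sketch. This uses sqlite3 as a stand-in for the SQL store, and all table and column names are illustrative assumptions, not taken from the actual system:

```python
import sqlite3

# A minimal sketch of a two-table SQL event store, extended with one
# snapshot row per aggregate holding its latest state as JSON.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (
    aggregate_id TEXT NOT NULL,
    version      INTEGER NOT NULL,   -- per-aggregate sequence number
    event_type   TEXT NOT NULL,
    payload      TEXT NOT NULL,      -- event data as JSON
    PRIMARY KEY (aggregate_id, version)
);
CREATE TABLE snapshots (
    aggregate_id TEXT PRIMARY KEY,   -- one row per aggregate: latest state
    version      INTEGER NOT NULL,   -- the version this snapshot reflects
    state        TEXT NOT NULL       -- full aggregate state as JSON
);
""")
print(conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
).fetchall())  # → [('events',), ('snapshots',)]
```

Keeping `version` in the snapshot row is what lets the read side (or a rebuild) tell whether the snapshot is consistent with the event stream.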

Having such a system would allow us to keep the benefits of event storage while eliminating any potential performance problems. We do not currently use any time-travel feature, although we are likely to eventually.

Is this a good approach? Something about it makes me uneasy. For example, if we ever need some kind of time-travel feature, then without snapshots scattered through each aggregate's event stream it would be a performance disaster. Of course, we could have both the latest snapshot per aggregate instance and snapshots within the event streams.

If we decide to go this route, should we update the snapshot for an aggregate in the same transaction that appends its events, or just append the events and update the snapshot eventually?
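The "same transaction" option can be sketched as follows, again with sqlite3 standing in for the SQL store and all names being illustrative assumptions. Either the events and the snapshot both commit, or neither does:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events   (aggregate_id TEXT, version INTEGER, payload TEXT,
                       PRIMARY KEY (aggregate_id, version));
CREATE TABLE snapshots(aggregate_id TEXT PRIMARY KEY, version INTEGER, state TEXT);
""")

def append_events_with_snapshot(conn, aggregate_id, new_events, new_state):
    """Append events AND upsert the latest snapshot atomically."""
    with conn:  # one transaction for both writes
        (last,) = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM events WHERE aggregate_id = ?",
            (aggregate_id,)).fetchone()
        for i, event in enumerate(new_events, start=1):
            conn.execute("INSERT INTO events VALUES (?, ?, ?)",
                         (aggregate_id, last + i, json.dumps(event)))
        conn.execute(
            "INSERT INTO snapshots VALUES (?, ?, ?) "
            "ON CONFLICT(aggregate_id) DO UPDATE SET "
            "version = excluded.version, state = excluded.state",
            (aggregate_id, last + len(new_events), json.dumps(new_state)))

append_events_with_snapshot(conn, "order-1",
                            [{"type": "Created"}, {"type": "Paid"}],
                            {"status": "paid"})
print(conn.execute("SELECT version, state FROM snapshots").fetchone())
# → (2, '{"status": "paid"}')
```

The trade-off: atomicity guarantees the snapshot is never stale, but it puts the snapshot write on the critical path of every business transaction, which is exactly what the answer below argues against.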

What are the disadvantages of this approach? Has anyone tried something like this?

1 answer

You should probably run your own tests before adding unnecessary complexity to your system. We noticed performance problems when thousands of events had to be queried and applied to rebuild an aggregate from its event stream, with JSON deserialization being the biggest bottleneck. If each of your aggregates has only a few events (say, fewer than 100), you probably won't notice any significant difference in practice.
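The rebuild the answer describes, the part that gets slow with thousands of events, is essentially a fold over the deserialized stream. A minimal sketch, where the event shape and `apply` logic are made up for illustration:

```python
import json

def apply(state, event):
    # Illustrative apply function: each event sets one key of the state dict.
    state = dict(state)
    state[event["key"]] = event["value"]
    return state

def rehydrate(serialized_events):
    """Rebuild an aggregate by deserializing and applying every event.
    With thousands of events, json.loads dominates the cost."""
    state = {}
    for raw in serialized_events:
        state = apply(state, json.loads(raw))
    return state

stream = [json.dumps({"key": "status", "value": v})
          for v in ("created", "paid", "shipped")]
print(rehydrate(stream))  # → {'status': 'shipped'}
```

Benchmarking `rehydrate` against your own streams (as the answer suggests) tells you whether snapshots are worth the added complexity at all.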

In most event stores, snapshots are written every n events/commits, say every 50-100 events; on load, the latest snapshot is queried and the events since that snapshot are applied on top of it. If you also keep all the old snapshots in the snapshot store, time travel becomes as fast as a regular query, at the cost of a little extra storage, which is cheap these days.
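The every-n-events scheme can be sketched like this; the counter-style event and `apply` function are illustrative assumptions, and the interval of 50 is one of the values the answer mentions:

```python
import json

SNAPSHOT_EVERY = 50  # the answer suggests every 50-100 events

def apply(state, event):
    # Illustrative apply function: each event increments a counter.
    return {"count": state.get("count", 0) + event["n"]}

def replay(events, snapshots):
    """Apply all events, recording a snapshot every SNAPSHOT_EVERY events.
    Old snapshots are retained, which makes time travel a cheap lookup."""
    state = {}
    for version, event in enumerate(events, start=1):
        state = apply(state, event)
        if version % SNAPSHOT_EVERY == 0:
            snapshots.append((version, json.dumps(state)))
    return state

def load(events, snapshots):
    """Start from the latest snapshot, then apply only the tail of events."""
    version, raw = snapshots[-1] if snapshots else (0, "{}")
    state = json.loads(raw)
    for event in events[version:]:   # only the events after the snapshot
        state = apply(state, event)
    return state

events = [{"n": 1}] * 120
snapshots = []
replay(events, snapshots)
print(len(snapshots), load(events, snapshots))  # → 2 {'count': 120}
```

With 120 events, snapshots land at versions 50 and 100, so `load` deserializes one snapshot and applies only the last 20 events instead of replaying all 120.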

Snapshots should always be written outside of the original transaction (and can be generated on another thread), since it is not critical if the latest snapshot is missing, but you do not want your business transaction to fail because of an error while writing a snapshot.
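That separation can be sketched as follows: commit the events first, then attempt the snapshot write and merely log any failure. The dict-backed stores and function names are illustrative assumptions:

```python
import json
import logging

def save_snapshot(store, aggregate_id, version, state):
    """Hypothetical snapshot writer; raises on storage errors."""
    store[aggregate_id] = (version, json.dumps(state))

def commit(event_store, snapshot_store, aggregate_id, events, version, state):
    """Commit the business transaction first; write the snapshot afterwards
    (this could equally run on another thread). A failed snapshot write is
    only logged: the aggregate can always be rebuilt from its events."""
    event_store.setdefault(aggregate_id, []).extend(events)  # must succeed
    try:
        save_snapshot(snapshot_store, aggregate_id, version, state)
    except Exception:
        logging.warning("snapshot write failed for %s; events are safe",
                        aggregate_id)

event_store, snapshot_store = {}, {}
commit(event_store, snapshot_store, "order-1",
       [{"type": "Paid"}], 1, {"status": "paid"})
print(len(event_store["order-1"]), snapshot_store["order-1"][0])  # → 1 1
```

The key property is that the `try/except` keeps snapshot storage off the critical path: a lost snapshot costs only a slower rebuild next time, never a failed business transaction.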

Depending on your typical uptime and data size, it might also make sense to keep snapshots in memory, in a distributed cache/grid, or in another (non-SQL) database.


Source: https://habr.com/ru/post/1237676/
