Design and architecture for multiple concurrent users subscribing to a data feed

Here's the scenario: I have a "data feed", a REST/JSON service that is updated periodically (say, every 10 seconds or so), and whenever the data set changes, all subscribed listeners must be updated.

Currently, it is implemented using HTTP long polling. That is a technical detail; the main idea is that the clients do not bother the server, and the server does not bother the clients, unless there is something new to report. When there is something new, all clients are notified immediately. The technology stack is Java / Tomcat7 with async IO (asyncResponse).
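As a rough illustration of the long-poll pattern described above, here is a minimal plain-JDK sketch. It is not my actual Tomcat code: `LongPollHub`, `poll`, and `publish` are hypothetical names, a `CompletableFuture` stands in for the parked async response, and request timeouts are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical hub: parks waiting clients and releases them all on a change. */
class LongPollHub {
    private final Object lock = new Object();
    private String current = "";   // last published JSON
    private long version = 0;      // bumped on every real change
    private List<CompletableFuture<String>> waiters = new ArrayList<>();

    /** Called per client request; completes at once if the client is stale. */
    CompletableFuture<String> poll(long clientVersion) {
        synchronized (lock) {
            if (clientVersion < version) {
                return CompletableFuture.completedFuture(current);
            }
            CompletableFuture<String> parked = new CompletableFuture<>();
            waiters.add(parked);
            return parked;
        }
    }

    /** Called by the periodic refresh; returns how many clients were notified. */
    int publish(String json) {
        List<CompletableFuture<String>> toNotify;
        synchronized (lock) {
            if (json.equals(current)) {
                return 0;          // data set unchanged: nobody is bothered
            }
            current = json;
            version++;
            toNotify = waiters;    // swap the list so completion runs outside the lock
            waiters = new ArrayList<>();
        }
        for (CompletableFuture<String> parked : toNotify) {
            parked.complete(json); // every waiter receives the same payload
        }
        return toNotify.size();
    }

    /** Version a client should send with its next poll. */
    long version() {
        synchronized (lock) {
            return version;
        }
    }
}
```

A client would call `poll` with the last version it saw, and the 10-second refresh job would call `publish`. In the real service, completing the future would presumably correspond to resuming the parked async response.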

I think it works fine: I can handle 10K concurrent sessions for ~$0.07 per hour (AWS m3.medium instance).

(Question: I think this is good, but I would like to hear some benchmark numbers for comparison. In other words, do you think this is good bang for the buck? Please share!)

If all my clients receive the same dataset (same JSON), is there a way I could optimize even more?

I'm thinking of IPv6 multicast - this would reduce bandwidth consumption by orders of magnitude - but is it practical?

To support 1 million concurrent users, for example, with an update every 10 seconds, I will need to serve 100K responses per second. If the response size is 10K, bandwidth starts to become a big problem: 10K * 100K * 60 * 60 * 24 → roughly 86 TB in 24 hours.
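Multiplying out the figures in the paragraph above gives a sustained rate of about 1 GB/s, or roughly 86.4 TB per day. The small sketch below reproduces that arithmetic; the class and method names are made up for illustration, and it assumes "10K" means 10,000 bytes and that every user receives every update.

```java
/** Back-of-the-envelope bandwidth estimate using the figures from the post. */
class FeedBandwidth {
    /** Responses per second when every user gets one update per interval. */
    static long responsesPerSecond(long users, long intervalSeconds) {
        return users / intervalSeconds;
    }

    /** Bytes sent per second at a steady response rate. */
    static long bytesPerSecond(long responsesPerSecond, long payloadBytes) {
        return responsesPerSecond * payloadBytes;
    }

    /** Bytes sent over 24 hours at a steady response rate. */
    static long bytesPerDay(long responsesPerSecond, long payloadBytes) {
        return bytesPerSecond(responsesPerSecond, payloadBytes) * 60 * 60 * 24;
    }

    public static void main(String[] args) {
        long payload = 10_000;                          // assumption: "10K" = 10,000 bytes
        long rps = responsesPerSecond(1_000_000, 10);   // 1M users, one update per 10 s
        System.out.printf("%d responses/s, %.1f GB/s, %.1f TB/day%n",
                rps,
                bytesPerSecond(rps, payload) / 1e9,
                bytesPerDay(rps, payload) / 1e12);
    }
}
```

The estimate ignores HTTP headers, TLS overhead, and compression, so the real number could differ noticeably in either direction.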

There is no single focused question here (besides IPv6) - I would like to hear your thoughts, experience and alternative approaches. I hate reinventing the wheel, and I am sure that the collective wisdom out there far exceeds my own.

Thanks.


Source: https://habr.com/ru/post/1546519/

