I have a large amount of data that I need to store and generate reports on. Each record represents an event on a website (we are talking about 50 events per second, so older data will clearly have to be aggregated).
I am evaluating approaches to implementing this. Obviously it needs to be reliable and as easy to scale as possible. It should also be possible to generate reports from the data in a flexible and efficient way.
I hope that some SOers have experience with this kind of software and can make recommendations and/or point out pitfalls.
Ideally, I would like to deploy this on EC2.
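To make the aggregation requirement concrete, here is a rough sketch of what I mean by rolling up older data. Everything in it is an assumption for illustration only: the `(timestamp, event_type)` record shape, the `roll_up` name, the seven-day raw retention window and the hourly bucket size are all hypothetical, and a real system would do this inside whatever store is chosen rather than in memory.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# At roughly 50 events per second, the raw volume per day is:
EVENTS_PER_DAY = 50 * 60 * 60 * 24  # 4,320,000


def roll_up(events, now, keep_raw=timedelta(days=7)):
    """Keep recent events as raw rows and collapse older ones into hourly counts.

    events: iterable of (timestamp, event_type) tuples (hypothetical schema).
    keep_raw: how long raw rows are kept before aggregation (assumed value).
    """
    cutoff = now - keep_raw
    recent = []
    hourly = Counter()
    for ts, event_type in events:
        if ts >= cutoff:
            recent.append((ts, event_type))
        else:
            # Truncate the timestamp to the start of its hour.
            bucket = ts.replace(minute=0, second=0, microsecond=0)
            hourly[(bucket, event_type)] += 1
    return recent, hourly
```

Reports over recent data would then read the raw rows, while reports over older periods would read the hourly counts. The trade-off is that anything not captured in the aggregate key (here only the event type) is lost once the raw rows are dropped.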