"(MySQL) will not be sufficient for my needs (in terms of scalability, speed, etc.)"
Well... Facebook uses MySQL to a large extent. When it comes to NoSQL, I think it's not so much about the technology as it is about the data structures and algorithms.
What you are facing is a high write throughput situation. One approach to high write throughput that works well for your problem is sharding: no matter how big the machine or how efficient the software, there is a limit to the number of writes a single machine can handle. Sharding distributes the data across multiple servers, so writes are spread out as well. For example, users A-M write to server 1 and users N-Z write to server 2.
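To make the idea concrete, here is a deliberately naive sketch of that range-based routing in Python with pymongo. The host names and the analytics/events names are made up, and in a real MongoDB cluster the mongos router does this routing for you:

```python
from pymongo import MongoClient

# Hypothetical shard hosts; in a real cluster you would talk to mongos instead.
shard_1 = MongoClient("mongodb://shard1.example.com:27017")["analytics"]
shard_2 = MongoClient("mongodb://shard2.example.com:27017")["analytics"]

def pick_shard(username: str):
    """Range-based routing: users A-M go to shard 1, N-Z to shard 2."""
    return shard_1 if username[:1].upper() <= "M" else shard_2

def record_visit(username: str, page: str):
    # Each server only sees a slice of the traffic, so total write
    # throughput grows with the number of shards.
    pick_shard(username)["events"].insert_one({"user": username, "page": page})
```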
The penalty is complexity: you have to balance the load between shards, aggregations across all shards become more complicated, you have to maintain several independent databases, and so on.
This is where the technology matters: sharding with MongoDB is fairly painless because it supports auto-sharding, which does most of the unpleasant work for you. I don't think you will need it at 500 inserts per second, but it's good to know it's there.
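For what it's worth, turning it on is only a couple of admin commands. A minimal sketch, assuming you are connected to a mongos router and that the database and collection are called analytics.events (hypothetical names):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://mongos.example.com:27017")

# Allow the "analytics" database to be distributed across shards.
client.admin.command("enableSharding", "analytics")

# Shard the events collection on customerId; the balancer then splits and
# migrates chunks between the shards automatically.
client.admin.command("shardCollection", "analytics.events",
                     key={"customerId": 1})
```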
For schema design, it is important to think about the shard key, which determines which shard is responsible for a document. That may depend on your traffic patterns. Suppose you have a customer who runs a fairground: once a year their site is swamped with traffic, but for the other 360 days it is one of the lowest-traffic sites around. Now, if you shard by CustomerId, that particular customer can cause problems. On the other hand, if you shard by VisitorId, you will have to hit every shard for a simple count().
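To illustrate the trade-off (client, collection, and field names are hypothetical), note that the very same query is either targeted or scatter-gather depending on the shard key:

```python
from pymongo import MongoClient

events = MongoClient("mongodb://mongos.example.com:27017")["analytics"]["events"]

# Sharded on customerId: the filter contains the shard key, so mongos can
# route the query to the one shard owning that customer's chunks -- but the
# fairground customer's yearly spike then also hammers a single shard.
# Sharded on visitorId: writes spread evenly across shards, but this filter
# no longer contains the shard key, so mongos has to scatter the count to
# every shard and merge the partial results.
events.count_documents({"customerId": 42})
```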
The analytics part depends a lot on the queries you want to support. Real slice & dice is quite tricky, I would say, especially if you want to support near real-time analytics. A much simpler approach is to limit the options the user has and offer only a small set of operations. Those results can also be cached, so you don't have to redo all the aggregations every time.
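One common way to do that caching is to pre-aggregate counters at write time, so the limited set of dashboard operations never has to crunch the raw log. A sketch with made-up collection and field names:

```python
from datetime import datetime, timezone
from pymongo import MongoClient

db = MongoClient()["analytics"]

def record_pageview(customer_id, page):
    now = datetime.now(timezone.utc)
    # Keep the raw event for later ad-hoc analysis.
    db.events.insert_one({"customerId": customer_id, "page": page, "ts": now})
    # Bump a pre-aggregated daily counter; the dashboard reads these
    # documents directly instead of re-running the aggregation every time.
    db.daily_stats.update_one(
        {"customerId": customer_id, "day": now.strftime("%Y-%m-%d")},
        {"$inc": {"views": 1}},
        upsert=True,
    )
```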
In general, analytics can be tricky because many features require relations. For example, for cohort analysis you only want to consider log entries that were created by a specific group of users. A $in query will do the trick for smaller cohorts, but if we are talking about tens of thousands of users, it won't cut it. You could pick a random subset of the users instead, because that should be statistically sufficient, but of course that depends on your specific requirements.
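As a rough sketch of both ideas ($in for small cohorts, random sampling for huge ones), with hypothetical collections, fields, and cohort criterion:

```python
import random
from pymongo import MongoClient

db = MongoClient()["analytics"]

# The cohort: users who signed up in a given month (hypothetical criterion).
cohort_ids = [u["_id"] for u in
              db.users.find({"signupMonth": "2011-05"}, {"_id": 1})]

# Fine for small cohorts: a single $in filter over the log entries.
entries = db.events.find({"userId": {"$in": cohort_ids}})

# For tens of thousands of users, a random subset of the cohort is often
# statistically good enough -- whether it is depends on your requirements.
sample = random.sample(cohort_ids, min(1000, len(cohort_ids)))
sampled_entries = db.events.find({"userId": {"$in": sample}})
```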
Map/Reduce is useful for analyzing larger amounts of data: the processing happens on the server, and Map/Reduce also benefits from sharding because jobs can be processed by each shard individually. However, depending on a gazillion factors, those jobs will take some time.
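A minimal Map/Reduce sketch counting events per page (collection and field names are made up; recent MongoDB versions prefer the aggregation pipeline for this kind of job):

```python
from bson.code import Code
from pymongo import MongoClient

db = MongoClient()["analytics"]

map_fn = Code("function () { emit(this.page, 1); }")
reduce_fn = Code("function (key, values) { return Array.sum(values); }")

result = db.command({
    "mapReduce": "events",
    "map": map_fn,
    "reduce": reduce_fn,
    "out": {"inline": 1},  # return results directly instead of writing a collection
})

# On a sharded collection each shard maps/reduces its own chunks and the
# partial results are merged, which is why sharding helps here.
for doc in result["results"]:
    print(doc["_id"], doc["value"])
```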
I believe the Boxed Ice blog has some information on this; they definitely have experience handling lots of analytics data with MongoDB.