I'm trying to come up with a DateTime-based partition key strategy that doesn't lead to the append-only write bottleneck that is often described in best-practice guidance.
Basically, if you partition on something like YYYY-MM-DD, all of your writes for a given day will end up in the same partition, which will limit write performance.
Ideally, a partition key should distribute writes evenly across as many partitions as possible.
To achieve this while still basing the key on a DateTime value, I need a way to assign what amounts to a bucket of DateTime values, where the number of buckets is a predefined number per time interval - say, 50 per day. The assignment of a DateTime to a bucket should be as random as possible, but always the same for a given value, because I need to be able to derive the correct partition from the original DateTime. In other words, it's like a hash.
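To make that concrete, here is a minimal sketch of the kind of scheme I have in mind. The names (partition_key, BUCKETS_PER_DAY) and the choice of SHA-256 are just placeholders for illustration, not a settled design:

```python
import hashlib
from datetime import datetime

BUCKETS_PER_DAY = 50  # hypothetical predefined bucket count per day

def partition_key(dt: datetime) -> str:
    """Map a DateTime deterministically to one of the day's buckets.

    Zero-padding the bucket number keeps the "YYYYMMDD-BB" keys for
    one day inside a single lexicographic range, which is what the
    range-query requirement below depends on.
    """
    # Python's built-in hash() is salted per process, so use a
    # stable digest to get the same bucket for the same value.
    digest = hashlib.sha256(dt.isoformat().encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % BUCKETS_PER_DAY
    return f"{dt:%Y%m%d}-{bucket:02d}"

# The same value always yields the same key; distinct values spread
# pseudo-randomly across the day's 50 buckets.
print(partition_key(datetime(2024, 1, 1, 12, 30, 0)))  # e.g. "20240101-37"
```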
Finally, and critically, I need the partition key to be sequential at some aggregate level. So although the DateTime values for a given interval, say 1 day, would be randomly distributed across X partition keys, all of the partition keys for that day would fall within a queryable range. That would let me query all the rows for my aggregate interval and then sort them by DateTime to get the correct order.
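The query side would then look something like this, under the same assumed "YYYYMMDD-BB" key format (day_key_range is again just an illustrative helper):

```python
from datetime import date

def day_key_range(d: date, buckets: int = 50) -> tuple[str, str]:
    """Inclusive partition-key range covering all of one day's buckets."""
    prefix = d.strftime("%Y%m%d")
    return f"{prefix}-00", f"{prefix}-{buckets - 1:02d}"

# Fetch every row with low <= PartitionKey <= high, then sort the
# results client-side by their DateTime column to restore order.
low, high = day_key_range(date(2024, 1, 1))
print(low, high)  # 20240101-00 20240101-49
```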
Thoughts? This must be a fairly well-known problem that has been solved already.