I have a Kinesis stream with two shards that looks like this:
{
    "StreamDescription": {
        "StreamStatus": "ACTIVE",
        "StreamName": "my-stream",
        "Shards": [
            {
                "ShardId": "shardId-000000000001",
                "HashKeyRange": {
                    "EndingHashKey": "17014118346046923173168730371587",
                    "StartingHashKey": "0"
                }
            },
            {
                "ShardId": "shardId-000000000002",
                "HashKeyRange": {
                    "EndingHashKey": "340282366920938463463374607431768211455",
                    "StartingHashKey": "17014118346046923173168730371588"
                }
            }
        ]
    }
}
The sender side sets the partition key, which is usually a UUID. The records always fall into the 002 shard above, which leaves the stream unbalanced and therefore not scalable.
As a side note, Kinesis takes the MD5 hash of the partition key and routes the record to the shard whose hash key range contains that value. Indeed, when I tested this with the UUIDs I was using, they always fell into the same shard:
echo -n 80f6302fca1e48e590b09af84f3150d3 | md5sum
4527063413b015ade5c01d88595eec11
17014118346046923173168730371588 < 0x4527063413b015ade5c01d88595eec11 < 340282366920938463463374607431768211455
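To make the routing concrete, here is a small Python sketch (`shard_for_key` is a hypothetical helper, not part of any AWS SDK) that reproduces the rule described above: MD5 the partition key, interpret the digest as a 128-bit big-endian integer, and pick the shard whose hash key range contains it, using the exact ranges from the stream description:

```python
import hashlib

def shard_for_key(partition_key, shards):
    # Kinesis hashes the partition key with MD5 and treats the
    # 16-byte digest as a 128-bit big-endian integer.
    hash_value = int.from_bytes(
        hashlib.md5(partition_key.encode("utf-8")).digest(), "big")
    # The record goes to the shard whose hash key range contains it.
    for shard in shards:
        lo = int(shard["HashKeyRange"]["StartingHashKey"])
        hi = int(shard["HashKeyRange"]["EndingHashKey"])
        if lo <= hash_value <= hi:
            return shard["ShardId"]
    return None  # no shard covers this hash

# Ranges copied verbatim from the stream description above.
shards = [
    {"ShardId": "shardId-000000000001",
     "HashKeyRange": {"StartingHashKey": "0",
                      "EndingHashKey": "17014118346046923173168730371587"}},
    {"ShardId": "shardId-000000000002",
     "HashKeyRange": {"StartingHashKey": "17014118346046923173168730371588",
                      "EndingHashKey": "340282366920938463463374607431768211455"}},
]

print(shard_for_key("80f6302fca1e48e590b09af84f3150d3", shards))
# prints shardId-000000000002
```

Note that with the ranges shown, shard 001 covers only about 1.7×10^31 of the ~3.4×10^38 possible hash values, so virtually any partition key hashes into shard 002.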
Any idea on how to solve this problem?