I have used Hector, Astyanax, and Thrift directly. I have also used the Python client PyCassa.
The features I found important and differentiating were:
- API ease of use
- Composite Column Support
- Connection pool
- Latency
- Documentation
One of the main problems is type safety. You want to be able to pass in longs, strings, byte[], etc. Both Hector and Astyanax solve this with Serializer objects. In Astyanax, you specify them higher up the chain, so you have to specify them less often. In Hector, the syntax is often very awkward and hard to adapt if you change the schema.
Since Python is dynamically typed, this is much easier to handle in PyCassa. Since that is not an option for you, I won't say much about it, but I found it easier to use (of course), although also quite slow.
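To make the serializer idea concrete, here is a minimal sketch in plain Java. It is not the Hector or Astyanax API: `SketchSerializer`, `SketchSerializers` and `SketchColumnFamily` are hypothetical names made up for illustration. It only shows why binding the serializers once, on the column family definition, means you do not have to repeat them on every call.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical interface for illustration; not the real Hector/Astyanax type.
interface SketchSerializer<T> {
    ByteBuffer toByteBuffer(T value);
    T fromByteBuffer(ByteBuffer bytes);
}

final class SketchSerializers {
    private SketchSerializers() {}

    // Encodes a long as 8 big-endian bytes.
    static final SketchSerializer<Long> LONG = new SketchSerializer<Long>() {
        @Override public ByteBuffer toByteBuffer(Long value) {
            ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
            buf.putLong(value);
            buf.flip(); // make the written bytes readable
            return buf;
        }
        @Override public Long fromByteBuffer(ByteBuffer bytes) {
            // duplicate() so the caller's buffer position is left untouched
            return bytes.duplicate().getLong();
        }
    };

    // Encodes a string as UTF-8 bytes.
    static final SketchSerializer<String> STRING = new SketchSerializer<String>() {
        @Override public ByteBuffer toByteBuffer(String value) {
            return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
        }
        @Override public String fromByteBuffer(ByteBuffer bytes) {
            return StandardCharsets.UTF_8.decode(bytes.duplicate()).toString();
        }
    };
}

// The serializers are bound once here ("up the chain"), so later calls
// are checked by the compiler and never mention a serializer again.
final class SketchColumnFamily<K, C> {
    final String name;
    private final SketchSerializer<K> keySerializer;
    private final SketchSerializer<C> columnSerializer;

    SketchColumnFamily(String name, SketchSerializer<K> keySerializer,
                       SketchSerializer<C> columnSerializer) {
        this.name = name;
        this.keySerializer = keySerializer;
        this.columnSerializer = columnSerializer;
    }

    ByteBuffer encodeKey(K key) { return keySerializer.toByteBuffer(key); }
    ByteBuffer encodeColumn(C column) { return columnSerializer.toByteBuffer(column); }
}
```

With this shape, `new SketchColumnFamily<Long, String>("users", SketchSerializers.LONG, SketchSerializers.STRING)` fixes the types in one place, and passing a `String` where the key should be a `Long` becomes a compile error.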
Support for composite columns in Hector is very confusing. Astyanax has annotations to greatly simplify this.
As far as I know, connection pooling is equivalent in Hector and Astyanax. Both will avoid downed hosts and discover new ones added to the ring. Both of these features are critical to reliability and maintainability. Pelops seems to have these features too, but I have never tried it.
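As a rough illustration of those two pool behaviours (skipping downed hosts, picking up new ones), here is a toy round-robin host selector. It is only a sketch under my own assumptions: `SketchHostPool` and its methods are hypothetical, it hands out host names rather than real connections, and it is not how Hector or Astyanax are actually implemented.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical toy pool: round-robin over known hosts, skipping any marked down.
final class SketchHostPool {
    private final CopyOnWriteArrayList<String> hosts = new CopyOnWriteArrayList<>();
    private final Set<String> down = ConcurrentHashMap.newKeySet();
    private final AtomicInteger cursor = new AtomicInteger();

    SketchHostPool(List<String> initialHosts) {
        for (String host : initialHosts) {
            hosts.addIfAbsent(host);
        }
    }

    // Called when a new node is discovered in the ring.
    void addHost(String host) { hosts.addIfAbsent(host); }

    // Called when requests to a host start failing.
    void markDown(String host) { down.add(host); }

    // Called when a host is reachable again.
    void markUp(String host) { down.remove(host); }

    // Returns the next live host, trying each known host at most once.
    String nextHost() {
        String[] snapshot = hosts.toArray(new String[0]);
        for (int i = 0; i < snapshot.length; i++) {
            int index = Math.floorMod(cursor.getAndIncrement(), snapshot.length);
            String candidate = snapshot[index];
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no live hosts available");
    }
}
```

A real client would also need to detect failures and discover ring members on its own; here the caller does both by hand through `markDown` and `addHost`.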
The key difference between Astyanax and Hector is latency optimization. Astyanax can route read and write requests directly to a replica node for the key, potentially avoiding the additional network hop through a coordinator. This can reduce latency by a few milliseconds.
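The idea behind that routing can be sketched as a lookup on a token ring: hash the row key to a token, then pick the node that owns that token. The code below is a simplified illustration, not Astyanax's implementation. `SketchTokenRing` is a hypothetical name, CRC32 is only a stand-in for Cassandra's real partitioner hash, and it returns a single owner rather than a full replica set.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical token ring: each node owns the range ending at its token.
final class SketchTokenRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(long token, String host) { ring.put(token, host); }

    // The owner is the first node whose token is >= the given token,
    // wrapping around to the lowest token at the end of the ring.
    String ownerOfToken(long token) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(token);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Stand-in hash (assumption): a real client must use the same
    // partitioner as the cluster so both agree on who owns a key.
    static long tokenFor(String rowKey) {
        CRC32 crc = new CRC32();
        crc.update(rowKey.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    String ownerOfKey(String rowKey) { return ownerOfToken(tokenFor(rowKey)); }
}
```

A client that knows the owner can send the request straight to it instead of to an arbitrary node that would then forward it, which is where the saved hop comes from.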
Astyanax used to have poor documentation, but it has improved a lot.
The only advantage I see for Hector today is that it is much more widely used, so it's probably less buggy. But Astyanax has the better feature set.
Richard, Apr 16 '13 at 11:47