I have been running Kafka on Kubernetes for a long time without any major problems; however, I recently introduced a cluster of Cassandra pods and started having performance problems with Kafka.
Although Cassandra does not rely on the page cache the way Kafka does, it does write to disk frequently, which presumably affects the kernel's page cache.
I understand that Kubernetes pods manage memory resources through cgroups, which can be configured by setting memory requests and limits in Kubernetes, but I have noticed that Cassandra's use of the page cache can increase the number of page faults in my Kafka pods even when they don't seem to be competing for resources (i.e. there is memory available on their nodes).
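For anyone who wants to compare the counters between pods, here is a minimal sketch of how the page-fault numbers can be read from inside a container. It assumes the node uses cgroup v2 and that the container sees its own `memory.stat` at `/sys/fs/cgroup/memory.stat`; on cgroup v1 the file usually lives under `/sys/fs/cgroup/memory/` instead. The function names are only illustrative.

```python
from pathlib import Path

# Assumed path: cgroup v2 as seen from inside a container.
# On cgroup v1 it is typically /sys/fs/cgroup/memory/memory.stat.
MEMORY_STAT = Path("/sys/fs/cgroup/memory.stat")


def parse_memory_stat(text):
    """Parse 'key value' lines from a cgroup memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name integer" lines.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def fault_counters(text):
    """Return the cumulative total and major page-fault counters (None if absent)."""
    stats = parse_memory_stat(text)
    return {key: stats.get(key) for key in ("pgfault", "pgmajfault")}


if __name__ == "__main__":
    if MEMORY_STAT.exists():
        print(fault_counters(MEMORY_STAT.read_text()))
    else:
        print(f"{MEMORY_STAT} not found; adjust the path for this node")
```

The counters are cumulative, so sampling them twice in a Kafka pod and taking the difference gives the number of faults over that interval; `pgmajfault` is the one that implies actual disk reads.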
In Kafka, more page faults lead to more writes to disk, which undermines the benefits of sequential I/O and degrades disk performance. If you use something like AWS EBS volumes, this will eventually deplete your burst balance and cause catastrophic failures across your cluster.
My question is: can I isolate page cache resources in Kubernetes, or somehow tell the kernel that pages belonging to my Kafka pods should be kept in the cache longer than those belonging to my Cassandra pods?