It depends on your view in the state store.
In Kafka threads, the state is divided, and therefore each instance contains a part of the general state of the application. For example, when using the stateful DSL statement, a local instance of RocksDB is used to store its state. Thus, in this respect, the state is local.
On the other hand, all state changes are recorded in the Kafka theme. This section does not "live" on the application host, but in the Kafka cluster and consists of several sections and can be replicated. In the event of an error, this change topic is used to recreate the state of the failed instance in another instance that is not already running. Thus, since the change log is accessible to all instances of the application, it can also be considered global.
Keep in mind that the change log is the truth of the state of the application, and local storage is basically the cache of state fragments.
In addition, in the WordCount example, the record stream (data stream) is divided into words, so that counting of one word will be supported by one instance (and different instances support counting for different words).
For an architectural overview, I recommend http://docs.confluent.io/current/streams/architecture.html
Also this blog should be interesting http://www.confluent.io/blog/unifying-stream-processing-and-interactive-queries-in-apache-kafka/
source share