From what I understand for high availability in hadoop, we need one Node name and one backup Node, shared network storage space (shared between two name nodes), at least 2 data nodes to start a cluster cluster.
Please suggest if I am missing any other service needed to create a hadoop environment.
What should be the system requirements for the Node name, since it only processes metadata (CPU I / O Intensive). The data we are crunching is mainly related to I / O intensity.
source
share