More on this answer:
- "RegionServers should always be co-located with DataNodes in a distributed cluster if you want decent performance."
I'm not sure how everyone will interpret the term "co-located", so let me be more specific:
- What makes any physical server an "XYZ server" is that it runs a program called a daemon (think of a program that runs forever, doing background processing);
- what makes it a "file server" is that it runs a file-server daemon;
- what makes it a "web server" is that it runs a web-server daemon; AND
- what makes it a "DataNode server" is that it runs the HDFS DataNode daemon;
- what makes it a "RegionServer" is that it runs the HBase RegionServer daemon (program).
So, in all the Hadoop distributions (for example Cloudera, MapR, Hortonworks, and others), the common best practice is that, for HBase, the "RegionServers" are "co-located" with the "DataNode servers".
This means that each of the actual slave (DataNode) servers that form the HDFS cluster runs the HDFS DataNode daemon (program) AND also runs the HBase RegionServer daemon (program)!
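As a quick sanity check of that co-location, here is a minimal Java sketch, assuming an HBase 2.x client and the Hadoop HDFS libraries are on the classpath and that hbase-site.xml / core-site.xml point at your cluster. It simply lists the hosts running the RegionServer daemon and the hosts running the DataNode daemon so you can confirm they overlap:

```java
import java.io.IOException;
import java.util.Set;
import java.util.TreeSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class CoLocationCheck {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();

    // Hosts currently running the HBase RegionServer daemon
    Set<String> regionServers = new TreeSet<>();
    try (Connection hbase = ConnectionFactory.createConnection(conf);
         Admin admin = hbase.getAdmin()) {
      for (ServerName sn : admin.getRegionServers()) {
        regionServers.add(sn.getHostname());
      }
    }

    // Hosts currently running the HDFS DataNode daemon
    Set<String> dataNodes = new TreeSet<>();
    FileSystem fs = FileSystem.get(conf);
    if (fs instanceof DistributedFileSystem) {
      for (DatanodeInfo dn : ((DistributedFileSystem) fs).getDataNodeStats()) {
        dataNodes.add(dn.getHostName());
      }
    }

    System.out.println("RegionServer hosts: " + regionServers);
    System.out.println("DataNode hosts:     " + dataNodes);
    System.out.println("RegionServers co-located on DataNodes: "
        + dataNodes.containsAll(regionServers));
  }
}
```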
We do this to provide data locality: parallel processing and storage of the data on all the individual nodes of the HDFS cluster, with no "shipping" of giant loads of big data from the "storage" nodes to the "processing" nodes. Data locality is vital to the success of a Hadoop cluster, so we want the HBase RegionServers (the DataNodes that the HBase daemon runs on) to do all their processing (put/get/scan) against the DataNodes holding the HFiles that make up the HRegions that make up the HTables that make up the HBase (Hadoop database)...
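To make the put/get/scan part concrete, here is a minimal sketch using the standard HBase Java client. The table name "my_table", column family "cf", and qualifier "col" are made-up names for illustration, and the table is assumed to already exist; each call is routed to the RegionServer hosting the relevant HRegion, which is exactly where locality pays off:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PutGetScanExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("my_table"))) {

      // Put: handled by the RegionServer hosting this row's region; that
      // RegionServer writes HFiles that HDFS keeps (mostly) on its local disks.
      Put put = new Put(Bytes.toBytes("row-1"));
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
      table.put(put);

      // Get: served by the same RegionServer, reading local HFile blocks.
      Result result = table.get(new Get(Bytes.toBytes("row-1")));
      System.out.println(Bytes.toString(
          result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col"))));

      // Scan: each region is scanned by the RegionServer that hosts it.
      try (ResultScanner scanner = table.getScanner(new Scan())) {
        for (Result r : scanner) {
          System.out.println(Bytes.toString(r.getRow()));
        }
      }
    }
  }
}
```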
So servers (virtual machines or physical machines, running Windows, Linux, etc.) can run several daemons at the same time, and they often routinely run dozens of them.