Most people run Presto on the Hadoop nodes they already have. At Facebook, we typically run Presto on a subset of the nodes within a Hadoop cluster to spread out the network load.
As a rule of thumb, I would go with the standard industry ratios for a new cluster: 2 cores and 2-4 GB of memory per disk, with 10 gigabit networking if you can afford it. Once you have a few machines (4+), benchmark with your own queries on your own data. It should be obvious whether you need to adjust the ratios.
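As a quick back-of-the-envelope illustration of that rule of thumb, here is a minimal sizing sketch; the disk count is a made-up example input, not a number from this answer:

```python
# Minimal sizing sketch for the "2 cores and 2-4 GB of RAM per disk" rule of thumb.
# The disk count below is a hypothetical example, not a number from this answer.
disks_per_node = 12                      # assumed JBOD layout

cores_per_node = 2 * disks_per_node      # 2 cores per disk
min_ram_gb = 2 * disks_per_node          # 2 GB per disk
max_ram_gb = 4 * disks_per_node          # 4 GB per disk

print(f"{disks_per_node} disks -> {cores_per_node} cores, "
      f"{min_ram_gb}-{max_ram_gb} GB RAM, 10 GbE if you can afford it")
```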
When sizing hardware for a cluster from scratch, consider the following factors:
- The total data size determines how many disks you need. HDFS has a lot of overhead, so plan on plenty of disks.
- The ratio of CPU to disks depends on the ratio of hot data (the data you are actively working with) to cold data (archived data). If you are just starting your data warehouse, you will need a lot of CPU, since all of the data will be new and hot. On the other hand, most physical disks can only deliver data so fast, so at some point more CPUs stop helping.
- The ratio of CPU to memory depends on the size of the aggregations and joins you want to run and the amount of (hot) data you want to cache. Presto currently requires the final aggregation results and the hash table for a join to fit in memory on a single machine (we are actively working on removing these restrictions). If you have larger amounts of memory, the OS will cache disk pages, which significantly improves query performance; a rough memory-budget sketch follows this list.
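To make the CPU-to-memory trade-off concrete, here is a small per-node memory-budget sketch. All of the figures are illustrative assumptions; only the split between JVM heap (for aggregation results and join hash tables) and OS page cache (for hot data) comes from the points above:

```python
# Illustrative per-node memory budget: part of RAM goes to the Presto JVM heap,
# which must hold final aggregation results and join hash tables, and the rest
# is left to the OS page cache for hot data. All numbers are hypothetical.
total_ram_gb = 128            # assumed node RAM
presto_heap_gb = 32           # assumed Presto JVM heap
hot_data_per_node_gb = 500    # assumed hot data stored on this node

os_page_cache_gb = total_ram_gb - presto_heap_gb
cacheable_fraction = min(1.0, os_page_cache_gb / hot_data_per_node_gb)

print(f"Heap for aggregations/joins: {presto_heap_gb} GB")
print(f"OS page cache:               {os_page_cache_gb} GB "
      f"(~{cacheable_fraction:.0%} of this node's hot data)")
```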
In 2013 at Facebook, we ran our Presto processes as follows:
- We ran our JVMs with a 16 GB heap to leave most of the memory available for OS buffers.
- We did not run MapReduce tasks on the machines running Presto.
- Most Presto machines had 16 physical cores and used processor affinity (eventually cgroups) to limit Presto to 12 cores, so that the Hadoop DataNode process and other services could still run comfortably (a sketch of this per-node split follows the list).
- Most of the servers were on 10 gigabit networks, but we had one large, older cluster on 1 gigabit (which worked fine).
- We used the same configuration for the coordinator and the workers.
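Here is a rough sketch of that 2013 per-node split. The core counts and heap size are taken from the list above; the total RAM is an assumption, since the answer only states the heap size:

```python
# 2013 per-node resource split: a 16 GB Presto heap and 12 of 16 cores pinned to
# Presto (via CPU affinity / cgroups), leaving the remaining cores and memory for
# the Hadoop DataNode, other services, and OS buffers.
total_cores = 16      # from the list above
presto_cores = 12     # from the list above
presto_heap_gb = 16   # from the list above
total_ram_gb = 64     # hypothetical: total RAM is not stated in this answer

print(f"Presto:    {presto_cores} cores, {presto_heap_gb} GB heap")
print(f"Leftover:  {total_cores - presto_cores} cores, "
      f"{total_ram_gb - presto_heap_gb} GB for the DataNode, other services, and OS buffers")
```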
More recently, we have been running the following setup:
- The machines had 256 GB of memory, and we ran with a 200 GB Java heap.
- Most machines had 24-32 physical cores, and all cores were allocated to Presto.
- The machines had only minimal local storage for logs; all table data was remote (in a proprietary distributed file system).
- Most servers had a 25 gigabit connection to the fabric network (see the bandwidth sketch after this list).
- Coordinators and workers had similar configurations.
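Since all table data in that setup is remote, a worker can scan data no faster than its NIC delivers it. A small sketch of that bound follows; the scan size is a made-up example:

```python
# With table data stored remotely, per-node scan throughput is bounded by the NIC.
# 25 gigabits per second is roughly 3.1 gigabytes per second.
nic_gbit_per_s = 25
nic_gb_per_s = nic_gbit_per_s / 8         # ~3.1 GB/s

scan_gb = 1_000                           # hypothetical amount read by one worker
seconds = scan_gb / nic_gb_per_s
print(f"Reading {scan_gb} GB of remote table data takes at least {seconds:.0f} s per node")
```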