Running Hadoop software on office computers (when they are idle)

Is there a project that helps set up a Hadoop cluster on desktop computers when they are idle?

I would like to experiment with Hadoop / MR / HBase, but do not have access to 5-10 dedicated computers. The office computers sit idle after hours and are connected to each other over a high-speed network. What's more, the data would stay on our own network, so there is no privacy problem.

For this to work, I need a fairly lightweight monitor that runs on every machine. When a computer has been idle for X hours, it joins the cluster. When a user logs in, the machine must leave the cluster and release all the CPU / memory.

Does something like this exist?

+6
5 answers

You can use the Task Scheduler to detect the idle state, and then start / stop a Hadoop VM using VirtualBox or VMware Player. Or you can write a PowerShell script that starts and stops the VM based on resource usage.
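The join / leave decision itself is small enough to sketch separately from whatever starts the VM. The following is only an illustration in Python, not an existing tool: `IdleMonitor`, `Action`, and `idle_threshold_s` are hypothetical names, and the caller is assumed to supply the idle time and login state from the operating system.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"    # nothing to do
    JOIN = "join"    # caller should start the Hadoop VM / worker
    LEAVE = "leave"  # caller should stop it and free CPU and memory


@dataclass
class IdleMonitor:
    """Decides when a desktop should join or leave the cluster.

    idle_threshold_s stands in for the "X hours" from the question,
    expressed in seconds (a hypothetical parameter).
    """
    idle_threshold_s: float
    in_cluster: bool = False

    def update(self, idle_seconds: float, user_logged_in: bool) -> Action:
        # A logged-in user always wins: leave immediately.
        if user_logged_in:
            if self.in_cluster:
                self.in_cluster = False
                return Action.LEAVE
            return Action.NONE
        # Nobody is logged in: join once the machine has been idle long enough.
        if not self.in_cluster and idle_seconds >= self.idle_threshold_s:
            self.in_cluster = True
            return Action.JOIN
        return Action.NONE
```

A scheduled task or small service would call `update()` every minute or so and start or stop the VM when it returns `Action.JOIN` or `Action.LEAVE`. How the idle time is measured and how the VM is controlled are platform-specific and left out here.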

+3

Hadoop is not a computational grid, but rather a data grid (see slide 9 in this presentation). With Hadoop, the data is distributed across the cluster, so it has to be stored on the machines themselves. The time taken to copy data onto a machine when it becomes idle, and to delete it again when the user returns, is probably not worth it. You would be better off running Hadoop in the cloud (Amazon, Azure, etc.).
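To get a feel for the copy cost, a back-of-the-envelope estimate can help. The sketch below is plain arithmetic, not a measurement: `transfer_hours` is a hypothetical helper, and the `efficiency` factor (the fraction of nominal link speed actually achieved) is an assumption you would need to replace with a figure from your own network.

```python
def transfer_hours(data_gb: float, link_gbit_per_s: float,
                   efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move data_gb gigabytes over one link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable (protocol overhead, disk speed, other traffic).
    """
    if data_gb < 0 or link_gbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("sizes and speeds must be positive, efficiency in (0, 1]")
    data_gbit = data_gb * 8  # gigabytes to gigabits
    seconds = data_gbit / (link_gbit_per_s * efficiency)
    return seconds / 3600
```

For example, `transfer_hours(450, 1.0, efficiency=1.0)` comes to one hour for a single machine on an ideal 1 Gbit/s link. Compare the result for your own data size against the length of the idle window to judge whether re-copying data each night is realistic.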

+1

I would use something like Condor: http://research.cs.wisc.edu/condor/

+1

You might want to take a look at this Virginia Tech project: http://www.wired.com/wiredenterprise/2012/05/project_moon/

+1

Look at solutions like NEREUS, which is a good MPC solution in Java.

0

Source: https://habr.com/ru/post/913183/

