Hadoop - trying to force a task attempt to launch on a different node

I submitted a job to a cluster of 4 hosts and saw that it was correctly distributed across the 4 nodes, 1 map task per node.

Later, one of the nodes failed.

I stopped the tasktracker on the failed node, added the node's name to the exclude file, and refreshed the node list with hadoop mradmin -refreshNodes . The failed node disappeared from the list of available nodes on the admin pages.

Then I started the tasktracker again, refreshed the nodes with mradmin, and saw that the node appeared in the job tracker's node list again.
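The exclude/refresh steps above can be sketched as a pair of small shell helpers. This is only an illustration: the function names exclude_node and include_node are hypothetical, and the exclude file path is whatever mapred.hosts.exclude points to on your cluster. The helpers only edit the file; hadoop mradmin -refreshNodes still has to be run afterwards.

```shell
#!/bin/sh
# Hypothetical helpers that only edit the exclude file.
# The file path is an assumption: use the one configured in mapred.hosts.exclude.

# Add a host to the exclude file unless it is already listed.
exclude_node() {
    file=$1
    host=$2
    touch "$file"
    grep -qxF "$host" "$file" || printf '%s\n' "$host" >> "$file"
}

# Remove a host from the exclude file (exact whole-line match).
include_node() {
    file=$1
    host=$2
    [ -f "$file" ] || return 0
    # grep -v exits non-zero when nothing is left, which is fine here.
    grep -vxF "$host" "$file" > "$file.tmp" || true
    mv "$file.tmp" "$file"
}

# Example usage (path and hostname are placeholders):
#   exclude_node /path/to/mapred.exclude node3.example.com
#   hadoop mradmin -refreshNodes
#   ... fix the node, restart its tasktracker ...
#   include_node /path/to/mapred.exclude node3.example.com
#   hadoop mradmin -refreshNodes
```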

While the node was down, Hadoop rescheduled its map task on another node, which then started running 2 map tasks. My cluster is now unbalanced:

  • 2 nodes are running 1 task each,
  • 1 node is running 2 tasks,
  • and 1 node (the one I restarted) is not running any tasks.

I killed one task attempt with hadoop job -kill-task attempt_201308010141_0001_m_000000_1 , and it looks like it is never started again, so now I see 3 nodes running 1 task each, 1 node without any tasks, and 1 pending task in the list.
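For reference, a Hadoop 1.x task attempt id has the form attempt_<jobtracker start time>_<job number>_<m|r>_<task number>_<attempt number>, so the job id and task id can be read off the attempt id. A minimal sketch, with hypothetical function names, that extracts them in plain shell:

```shell
#!/bin/sh
# Hypothetical helpers; they only do string manipulation on the id.

# Print the id of the job that a task attempt belongs to.
job_id_from_attempt() {
    echo "$1" | sed -n 's/^attempt_\([0-9]*\)_\([0-9]*\)_[mr]_[0-9]*_[0-9]*$/job_\1_\2/p'
}

# Print the task id, i.e. the attempt id without the attempt number.
task_id_from_attempt() {
    echo "$1" | sed -n 's/^attempt_\([0-9]*_[0-9]*_[mr]_[0-9]*\)_[0-9]*$/task_\1/p'
}

# Example usage:
#   job=$(job_id_from_attempt attempt_201308010141_0001_m_000000_1)
#   hadoop job -status "$job"
```

Both functions print nothing when the argument does not match the attempt id pattern, so an empty result signals a malformed id.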

Am I missing something? What is the correct way to "move" a task from one node to another?

1 answer

Jobs keep a list of blacklisted task trackers (there is a global blacklist and a per-job one). I think that is why your new attempt is not started on the restarted task tracker.

You can try the commands:

    hadoop job -unblacklist <jobid> <hostname>
    hadoop job -unblacklist-tracker <hostname>

From http://doc.mapr.com/display/MapR/TaskTracker+Blacklisting


Source: https://habr.com/ru/post/1494630/

