Hadoop - force a task attempt to start on a different node

Question

I submitted a job to a cluster of 4 hosts and saw that it was distributed correctly across the 4 nodes, 1 map task per node.

Later, one of the nodes failed.

I stopped the TaskTracker on the failed node, added the node's hostname to the exclude file, and refreshed the node list with hadoop mradmin -refreshNodes. The failed node disappeared from the list of available nodes on the admin pages.

Then I started the TaskTracker again, refreshed the nodes with mradmin, and saw the node reappear in the list of trackers.

While the node was down, Hadoop rescheduled its map task on another node, which then ran 2 map tasks. My cluster is now unbalanced:

2 nodes are running 1 task each,
1 node is running 2 tasks,
and 1 node (the one I restarted) is not running any tasks.

I killed the task attempt with hadoop job -kill-task attempt_201308010141_0001_m_000000_1, and it looks like it never starts again, so now I see three nodes running 1 task each, 1 node without any tasks, and 1 pending task in the list.

Am I missing something? What is the correct way to "move" a task from one node to another?

+4

hadoop

jdevelop Aug 1 '13 at 4:35

1 answer

ALSimon · Answer 1 · 2014-12-23T16:44:26+0000

Jobs keep a list of blacklisted TaskTrackers (there is a global blacklist and a per-job one). I think your new attempt is not starting on the restarted TaskTracker because that tracker is still blacklisted.

You can try the commands:

hadoop job -unblacklist <jobid> <hostname>
hadoop job -unblacklist-tracker <hostname>

From http://doc.mapr.com/display/MapR/TaskTracker+Blacklisting
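The sequence above could be wrapped in a small script. This is only a sketch under some assumptions: the two unblacklist commands come from the MapR page and may not exist in stock Apache Hadoop; `-list-blacklisted-trackers` is the Hadoop 1.x way to inspect the blacklist first; the `run` and `clear_blacklist` helpers and the `DRY_RUN` switch are hypothetical names introduced here. The reading of one command as per-job and the other as global is inferred from their arguments.

```shell
#!/bin/sh
# Sketch only. DRY_RUN defaults to 1, so commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: clear_blacklist JOB_ID HOSTNAME
clear_blacklist() {
    job_id="$1"
    host="$2"
    # Hadoop 1.x: list the TaskTrackers currently blacklisted.
    run hadoop job -list-blacklisted-trackers
    # MapR-specific (from the answer): presumably clears the per-job entry.
    run hadoop job -unblacklist "$job_id" "$host"
    # MapR-specific (from the answer): presumably clears the global entry.
    run hadoop job -unblacklist-tracker "$host"
}
```

For the job in the question, the call would be something like clear_blacklist job_201308010141_0001 <hostname>; set DRY_RUN=0 to actually run the commands once the printed output looks right.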
