How to delete files from /usr/lib/hadoop/lib before running an EMR job on AMI 4.x?

I have a Hadoop job that uses version 1.5 of the commons-codec library. To get this job working on EMR AMI 3.x, I had to create a bootstrap action that removed all earlier versions of the jar from the cluster so that they would not be loaded. These are the relevant lines of that script:

sudo find / -name "commons-codec-1.2.jar" -exec rm -rf {} \;
sudo find / -name "commons-codec-1.3.jar" -exec rm -rf {} \;
sudo find / -name "commons-codec-1.4.jar" -exec rm -rf {} \;

It worked.

But now I am upgrading to AMI 4.x, and I have a problem: binary-incompatible earlier versions of the commons-codec jar appear on the slave nodes' file system after the bootstrap script has run.

At the time the bootstrap script runs, the directory /usr/lib/hadoop/lib does not exist. But by the time the first step of my job starts, the directory exists and contains the file /usr/lib/hadoop/lib/commons-codec-1.4.jar. This file causes a VerifyError because it is loaded instead of the later version of commons-codec that I bundled in my jar.
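The timing described above can be checked with a small script run once from the bootstrap action and once from the first step. This is only a sketch: the helper name `list_codec_jars` is made up for illustration, and the default path is the one mentioned above.

```shell
# Hypothetical helper: list every commons-codec jar under a directory.
# The default directory is the one from the question; pass another as $1.
list_codec_jars() {
    root="${1:-/usr/lib/hadoop/lib}"
    if [ ! -d "$root" ]; then
        # On AMI 4.x this is what I would expect to see at bootstrap time.
        echo "missing: $root"
        return 0
    fi
    # Print matching regular files, sorted for stable output.
    find "$root" -name 'commons-codec-*.jar' -type f 2>/dev/null | sort
}
```

Calling `list_codec_jars` with no arguments at both points should show whether the directory is missing during bootstrap and which jar versions are present once the step starts.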

Is there any way to remove the incompatible jar before my job starts? Alternatively, is there something I can do to make sure that the correct version of commons-codec (the one I put in my jar) is loaded instead?


Source: https://habr.com/ru/post/1617216/

