C++ Streaming Program and Shared Libraries

I have a C++ program that I am trying to run as a streaming job on Hadoop (it has only mappers, no reducers). A simple C++ program works correctly. However, another C++ program, which links against a large number of shared libraries (it uses many third-party libraries such as OpenCV and boost_serialization), does not work on the grid. ldd on that program shows the following:

 /usr/local/lib/libboost_serialization.so.1.48.0
 /usr/local/lib/libfftw3f.so.3
 /usr/local/lib/libconfig++.so.9
 /usr/local/lib/liblog4cpp.so.4
 /usr/local/lib/libopencv_core.so.2.3
 /usr/local/lib/libopencv_contrib.so.2.3

I suspect this fails because those shared libraries are not installed on the data nodes. I tried packing the libraries into a tarball and shipping it to the streaming job with the -archives option (distributed cache). That didn't work either (I'm not sure whether the tarball's contents were unpacked into the right directory on the data nodes).
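For reference, a minimal sketch of how that -archives approach is usually wired up, assuming hypothetical names throughout (mylibs.tgz, my_mapper, the HDFS input/output paths, and the streaming jar location are all placeholders):

 # Pack the libraries at the archive root (-C), so they unpack directly
 # into the symlinked directory rather than under usr/local/lib/.
 tar czf mylibs.tgz -C /usr/local/lib \
     libboost_serialization.so.1.48.0 libfftw3f.so.3 libconfig++.so.9 \
     liblog4cpp.so.4 libopencv_core.so.2.3 libopencv_contrib.so.2.3

 # Generic options such as -archives must come before the streaming
 # options. The '#mylibs' suffix names the symlink Hadoop creates in
 # each task's working directory after unpacking the archive, so a
 # relative LD_LIBRARY_PATH can point at it.
 hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
     -archives mylibs.tgz#mylibs \
     -cmdenv LD_LIBRARY_PATH=./mylibs \
     -input /user/me/input \
     -output /user/me/output \
     -mapper ./my_mapper \
     -file my_mapper \
     -numReduceTasks 0

One common pitfall: if the tarball is built with absolute paths, the libraries end up under mylibs/usr/local/lib/ instead of mylibs/, which would explain the libraries not being found even though the archive was shipped.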

Any idea how to do this?

1 answer

Statically compile your C++ program. Basically:

 g++ -o <program> -static <object-files>

This will create a binary with no shared-library dependencies, so nothing needs to be installed on the data nodes. It will be bulky (run strip on it!), but if the job is run continuously the extra size shouldn't be a problem.
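As a concrete illustration, a sketch of such a static link against the libraries from the ldd output above. This assumes static (.a) versions of each library are installed and that my_mapper.o is a placeholder for the program's object files; the exact -l flags depend on how the libraries were built:

 # Sketch only. Link order matters with static libraries: list
 # dependents before their dependencies (e.g. opencv_contrib
 # before opencv_core).
 g++ -o my_mapper -static my_mapper.o \
     -lboost_serialization -lfftw3f -lconfig++ -llog4cpp \
     -lopencv_contrib -lopencv_core

 # Shrink the resulting binary by stripping symbols.
 strip my_mapper

In practice a fully static link of libraries like OpenCV may pull in additional system libraries (pthread, zlib, etc.), so expect to add more -l flags as the linker reports unresolved symbols.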
