Accessing files in a resource directory in a JAR from an Apache Spark Streaming context

I have a Java application, written as a Spark Streaming job, that needs some text resources. I bundle them in the JAR under the resource directory (using the default Maven directory structure). In unit tests I can read these files without any problem, but when I run the program with spark-submit I get a FileNotFoundException. How do I read files from the classpath inside my JAR at startup when launching with spark-submit?

The code I use to access my file looks something like this:

    InputStream input;
    try {
        URL url = this.getClass().getClassLoader().getResource("my file");
        if (url == null) {
            throw new IOException("file does not exist");
        }
        String path = url.getPath();
        input = new FileInputStream(path);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

Thanks.

Please note that this is not a duplicate of the question about reading a resource file from a JAR (as suggested), because this code works when run locally. The failure only happens when the job is started on a Spark cluster.

1 answer

I fixed this by accessing the resource directory in a different (and much less stupid) way:

    input = MyClass.class.getResourceAsStream("/my file");
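A likely reason for the difference: in unit tests the resources sit in a plain directory on disk, so the URL returned by getResource is a file: URL and its path can be opened with FileInputStream. Once the application is packaged, the resource lives inside the JAR, the URL uses the jar: scheme, and its path is not a real filesystem path, so FileInputStream throws FileNotFoundException. Reading the resource as a stream avoids the filesystem entirely. Below is a minimal sketch of that approach with proper stream closing; the class name ClasspathResources and the methods readBytes and readText are hypothetical helpers introduced for illustration, and UTF-8 content is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original answer.
public class ClasspathResources {

    // Reads a classpath resource fully into memory.
    // A leading "/" makes the name absolute; without it, the name is
    // resolved relative to the package of the anchor class.
    public static byte[] readBytes(Class<?> anchor, String name) {
        try (InputStream in = anchor.getResourceAsStream(name)) {
            if (in == null) {
                throw new IOException("resource not found on classpath: " + name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            // Same policy as the question's code: rethrow I/O problems unchecked.
            throw new UncheckedIOException(e);
        }
    }

    // Convenience wrapper for text resources, assuming UTF-8 content.
    public static String readText(Class<?> anchor, String name) {
        return new String(readBytes(anchor, name), StandardCharsets.UTF_8);
    }
}
```

With this helper, the call from the answer would become something like ClasspathResources.readText(MyClass.class, "/my file"). Note that Class.getResourceAsStream needs the leading slash for an absolute name, while ClassLoader.getResource (used in the question) expects the name without it.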

Source: https://habr.com/ru/post/1258786/

