Apache Beam: unable to find registrar for gs

Beam uses Google's AutoValue and AutoService tools.

I want to run the pipeline with the Dataflow runner, and the data is stored in Google Cloud Storage.

I added the dependencies:

<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
  <version>2.0.0</version>
</dependency>
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-extensions-google-cloud-platform-core</artifactId>
  <version>2.0.0</version>
</dependency>

I can start the pipeline from IntelliJ. But when the jar is built with mvn package and run with java -jar, it throws an error:

 java.lang.IllegalStateException: Unable to find registrar for gs 

The fat jar is packaged with maven-assembly-plugin. The GcsFileSystemRegistrar class is in the jar.

+5
2 answers

The problem is how you are building your fat jar. maven-assembly-plugin does not handle the files that ServiceLoader depends on correctly. ServiceLoader relies on each implementation being listed in META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar, which is how Java knows where to find them. When several dependencies each ship their own copy of that file, the copies must be merged rather than one overwriting the others.

The contents of META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar in your fat jar are most likely:

 org.apache.beam.sdk.io.LocalFileSystemRegistrar 

You need it to list the following (plus any other implementations you want), one class name per line:

 org.apache.beam.sdk.io.LocalFileSystemRegistrar
 org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystemRegistrar
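One way to check what a given classpath actually exposes is to read every copy of that service file yourself. The following is a minimal sketch using only the Java standard library; the class name ListRegistrars and its helper methods are hypothetical names chosen for illustration, not part of Beam.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic helper (not part of Beam): prints every class name
// listed in the FileSystemRegistrar service files visible on the classpath.
final class ListRegistrars {
    static final String SERVICE_FILE =
        "META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar";

    // Parses one provider-configuration file: one class name per line,
    // '#' starts a comment, blank lines are ignored.
    public static List<String> parseServiceFile(BufferedReader reader) throws IOException {
        List<String> names = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();
            if (!line.isEmpty()) {
                names.add(line);
            }
        }
        return names;
    }

    // Collects the entries from every copy of the resource the loader can see.
    public static List<String> listProviders(ClassLoader loader, String resource)
            throws IOException {
        List<String> names = new ArrayList<>();
        Enumeration<URL> urls = loader.getResources(resource);
        while (urls.hasMoreElements()) {
            URL url = urls.nextElement();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                names.addAll(parseServiceFile(reader));
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = ListRegistrars.class.getClassLoader();
        for (String name : listProviders(loader, SERVICE_FILE)) {
            System.out.println(name);
        }
    }
}
```

Run with the fat jar on the classpath, this should print only the entries that survived packaging. Run from the IDE, where each dependency jar keeps its own service file, it should print one entry per implementation.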

It is best to build the fat jar with a tool that understands these ServiceLoader requirements, such as maven-shade-plugin configured to use the ServicesResourceTransformer.
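A minimal sketch of such a configuration for the pom.xml build section is shown here. The plugin version and the main class com.example.MyPipeline are assumptions for illustration; substitute your own values.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <!-- Assumed version; use whichever release your build standardizes on -->
  <version>3.2.4</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- Merges all META-INF/services files instead of keeping only one copy -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
          <!-- Sets Main-Class so the jar can be run with java -jar; class name is hypothetical -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>com.example.MyPipeline</mainClass>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With this in place, mvn package should produce a jar whose FileSystemRegistrar service file lists every implementation contributed by the dependencies.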

+6

This looks like a build strategy problem: you have to merge the service entries for org.apache.beam.sdk.io.FileSystemRegistrar from all dependencies into a single file. Read more about a similar problem here.

+2

Source: https://habr.com/ru/post/1268547/

