Error creating Spark DynamoDB hive table

While trying to save a DataFrame to a DynamoDB table, I attempted to create a DynamoDB Hive table. I am using Python/PySpark, and I run the Spark application with --jars /usr/share/aws/emr/ddb/lib/emr-ddb-hive.jar.
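For reference, the full launch command looks roughly like this (the script name my_app.py is a placeholder for my actual application):

    # Sketch of the spark-submit invocation; only the --jars flag and jar path
    # are from my actual setup, the script name is illustrative.
    spark-submit \
      --jars /usr/share/aws/emr/ddb/lib/emr-ddb-hive.jar \
      my_app.py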

    sqlContext = HiveContext(sc)
    sqlContext.sql('CREATE EXTERNAL TABLE ddb (col1 string) \
        STORED BY "org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler" \
        TBLPROPERTIES ("dynamodb.table.name" = "table1", \
        "dynamodb.column.mapping" = "col1:col1")')

However, I get the following error:

    INFO DDLTask: Use StorageHandler-supplied org.apache.hadoop.hive.dynamodb.DynamoDBSerDe for table default.ddb
    ERROR DDLTask: java.lang.NoSuchMethodError: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.initSerdeParams(Lorg/apache/hadoop/conf/Configuration;Ljava/util/Properties;Ljava/lang/String;)Lorg/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe$SerDeParameters;
        at org.apache.hadoop.hive.dynamodb.DynamoDBSerDe.initialize(DynamoDBSerDe.java:51)
        at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:527)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:391)
        at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:276)
        at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:258)
        at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:605)
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:694)
        at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4135)
        at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:306)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
        ....

I am using an EMR 4.3 cluster (with all applications installed). This may be related to this Hive problem.

Is there any way to work around this? Thanks!


Source: https://habr.com/ru/post/1243517/
