I have been working on migrating Hive tables between two Hadoop clusters. What I did was copy the ORC files out of the source Hadoop cluster and then load them into the target cluster's HDFS using the following commands:
hadoop fs -get
hadoop fs -put
The ORC files in the target cluster can be read as follows in a Spark application:
df = sqlContext.sql('select * from orc.`path_to_where_orc_file_is`')
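As a sanity check, printing the schema of that DataFrame shows that Spark recovers the full nested schema from the ORC metadata, without any DDL being supplied:

df.printSchema()  # schema is inferred from the ORC file footers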
However, there is no corresponding table in Hive on the target Hadoop cluster.
Is there a way to create a table in Hive from an ORC file in HDFS without specifying the DDL or schema, given that the ORC file itself contains the schema information?
The reason I ask is that the schema of the original Hive table is deeply nested and has many fields.
Currently, the only solution I can think of is to read these ORC files into Spark and write them out with the saveAsTable option, as follows:
dfTable.write.format("orc").mode(SaveMode.Overwrite).saveAsTable("db1.test1")
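For reference, a rough PySpark equivalent of that workaround, assuming sqlContext is a HiveContext (or a Hive-enabled SparkSession) and reusing the placeholder path from above:

dfTable = sqlContext.read.format("orc").load("path_to_where_orc_file_is")  # schema inferred from ORC
dfTable.write.format("orc").mode("overwrite").saveAsTable("db1.test1")     # registers db1.test1 in the Hive metastore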