Impala - file not found error

I am using Impala with Flume as a file stream.

The problem is that Flume writes temporary files with the extension .tmp, and when they are deleted, Impala queries fail with the following message:

Backend 0: Could not open HDFS file hdfs://localhost:8020/user/hive/../FlumeData.1420040201733.tmp Error(2): No such file or directory

How can I make Impala ignore these .tmp files, or make Flume not write them, or write them to another directory?

Flume configuration:

### Agent2 - Avro Source and File Channel, hdfs Sink ###

# Name the components on this agent
Agent2.sources = avro-source
Agent2.channels = file-channel
Agent2.sinks = hdfs-sink

# Describe/configure Source
Agent2.sources.avro-source.type = avro
Agent2.sources.avro-source.hostname = 0.0.0.0
Agent2.sources.avro-source.port = 11111
Agent2.sources.avro-source.bind = 0.0.0.0

# Describe the sink
Agent2.sinks.hdfs-sink.type = hdfs
Agent2.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/user/hive/table/
Agent2.sinks.hdfs-sink.hdfs.rollInterval = 0
Agent2.sinks.hdfs-sink.hdfs.rollCount = 10000
Agent2.sinks.hdfs-sink.hdfs.fileType = DataStream

# Use a channel which buffers events in file
Agent2.channels.file-channel.type = file
Agent2.channels.file-channel.checkpointDir = /home/ubutnu/flume/checkpoint/
Agent2.channels.file-channel.dataDirs = /home/ubuntu/flume/data/

# Bind the source and sink to the channel
Agent2.sources.avro-source.channels = file-channel
Agent2.sinks.hdfs-sink.channel = file-channel
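One possible workaround, not from the original thread but based on the Flume HDFS sink's documented options, is to hide in-flight files from Impala by giving them a leading dot, since Impala and Hive skip files whose names start with "." or "_". A minimal sketch against the sink defined above:

# Sketch: hide files that are still being written.
# hdfs.inUsePrefix is a standard Flume HDFS sink property; a leading dot
# makes the temporary file hidden, so Impala queries skip it until it rolls.
Agent2.sinks.hdfs-sink.hdfs.inUsePrefix = .
# hdfs.inUseSuffix already defaults to .tmp; shown here only for clarity.
Agent2.sinks.hdfs-sink.hdfs.inUseSuffix = .tmp

When a file rolls, Flume renames it to drop the in-use prefix and suffix, so the finished FlumeData files remain visible to Impala.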
1 answer

I had this problem.

Upgrading Hadoop and Flume resolved it for me (from Cloudera Hadoop CDH 5.2 to CDH 5.3).

Try upgrading Hadoop, Flume, or Impala.
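Separately from upgrading, and not part of the original answer: if queries keep failing because Impala's cached metadata still references the deleted .tmp files, refreshing the table's metadata in impala-shell is a common remedy. The table name below is a placeholder:

-- Reload the table's file list (table name is hypothetical)
REFRESH flume_table;
-- Heavier option: discard and reload all cached metadata for the table
INVALIDATE METADATA flume_table;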


Source: https://habr.com/ru/post/1210206/

