I am using Cloudera Hadoop. I can run a simple MapReduce program where I provide a file as input to the job. This file contains the names of all the other files to be processed by the mapper function. But I am stuck at one point.
/folder1
    file1.txt
    file2.txt
    file3.txt
How can I specify the input path to MapReduce as "/folder1" so that it can start processing each file inside this directory?
Any ideas?
EDIT:
1) Initially, I passed inputFile.txt as the input to the MapReduce program. It worked great. The contents of inputFile.txt:

    file1.txt
    file2.txt
    file3.txt
2) But now, instead of giving an input file, I want to provide the input directory as args[0] on the command line:
hadoop jar ABC.jar /folder1 /output
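From the FileInputFormat docs, addInputPath() appears to accept a directory as well as a single file, so I would expect a driver along these lines to work. This is a minimal sketch of what I have in mind; LineMapper and its line-counting output are simplified placeholders, not my actual mapper:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class ABC {

        // Placeholder mapper: emits each input line with a count of 1.
        public static class LineMapper
                extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                context.write(value, ONE);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "process folder");
            job.setJarByClass(ABC.class);
            job.setMapperClass(LineMapper.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);

            // args[0] can be a directory such as /folder1; FileInputFormat
            // treats a directory as "every non-hidden file directly inside it".
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

This would be invoked exactly as in the command above (with ABC as the main class in the jar manifest). One thing I am unsure about: as far as I can tell, FileInputFormat does not descend into subdirectories by default, though mapreduce.input.fileinputformat.input.dir.recursive can reportedly be set to true if the folder contains nested directories.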