I am writing a map function using mrjob. My input comes from files in a directory on HDFS. The file names contain a small but important piece of information that is not present in the files themselves. Is there a way to find out, inside the map function, the name of the input file that the current key-value pair comes from?
I am looking for the equivalent of this Java code:
FileSplit fileSplit = (FileSplit) reporter.getInputSplit();
String fileName = fileSplit.getPath().getName();
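For illustration, here is a rough Python sketch of the behaviour I am after. It assumes the task runs under Hadoop Streaming, which exports job configuration properties to the task's environment with dots replaced by underscores, so the input path would show up as `mapreduce_map_input_file` (or `map_input_file` on older Hadoop versions). The helper names `current_input_file` and `current_input_basename` are made up for this example and are not part of mrjob.

```python
import os

# Assumption: Hadoop Streaming exposes the current input path under one of
# these environment variable names (newer name first, older name second).
_INPUT_FILE_VARS = ("mapreduce_map_input_file", "map_input_file")


def current_input_file(environ=None):
    """Return the full path of the current input file, or None if unknown."""
    environ = os.environ if environ is None else environ
    for name in _INPUT_FILE_VARS:
        value = environ.get(name)
        if value:
            return value
    return None


def current_input_basename(environ=None):
    """Return only the file name, like getPath().getName() in the Java code."""
    path = current_input_file(environ)
    if path is None:
        return None
    # Strip any trailing slash, then keep the last path component.
    return path.rstrip("/").rsplit("/", 1)[-1]
```

Inside the mapper I would then call `current_input_basename()` and include the result in the emitted key or value. As far as I can tell, mrjob also has a `jobconf_from_env` helper in `mrjob.compat` that may cover the same lookup, but I have not confirmed how it behaves across Hadoop versions.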
Thanks in advance!