I have files like this in S3:
1-2013-08-22-22-something
2-2013-08-22-22-something
etc.
Without srcPattern I can easily get all the files from the bucket, but I want to get only the files with a specific prefix, for example all files starting with 1. I tried using srcPattern, but for some reason it did not pick up any of the files.
My current command is:
    elastic-mapreduce --jobflow $JOBFLOW --jar /home/hadoop/lib/emr-s3distcp-1.0.jar \
      --args '--src,s3n://some-bucket/,--dest,hdfs:///hdfs-input,--srcPattern,[0-9]-.*' \
      --step-name "copying over s3 files"
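One way to narrow this down is to check what the pattern would match outside of EMR. The sketch below is only an illustration under two assumptions that are not confirmed in this post: that S3DistCp applies the srcPattern regex to the whole object path rather than just the file name, and that the pattern has to match that path in full. It uses Python's `re` module as a stand-in for the Java regex engine, and the `select` helper and the full paths are hypothetical, built from the file names above.

```python
import re

# Hypothetical full object paths, built from the file names in the question.
paths = [
    "s3n://some-bucket/1-2013-08-22-22-something",
    "s3n://some-bucket/2-2013-08-22-22-something",
]


def select(pattern, candidates):
    """Return the candidates that the pattern matches in full.

    Assumption: this mirrors how srcPattern is applied (whole path, full match).
    """
    regex = re.compile(pattern)
    return [c for c in candidates if regex.fullmatch(c)]


# The pattern from the command: a full path starts with "s3n://", not a digit.
original = select(r"[0-9]-.*", paths)

# A pattern that allows for the leading part of the path and keeps only prefix 1.
whole_path = select(r".*/1-.*", paths)

print(original)
print(whole_path)
```

If those assumptions hold, the original pattern selects nothing because no path begins with a digit, while a pattern such as `.*/1-.*` would select only the files whose names start with 1.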