I am using Hadoop MapReduce, and I want to join two files. My first MapReduce iteration gives me a file of ID/number pairs like this:
A 30
D 20
My goal is to use the ID from this file to look up the matching record in another file and produce a different output made of the trio ID, Number, Name, for example:
A ABC 30
D EFGH 20
But I'm not sure whether MapReduce is the best way to do this. Would it be better, for example, to use a FileReader to read the second input file and get the name by ID? Or can I do this with MapReduce?
If so, I'm trying to figure out how to do it. I tried the MultipleInputs approach:
MultipleInputs.addInputPath(job2, new Path(args[1]+"-tmp"), TextInputFormat.class, FlightsByCarrierMapper2.class);
MultipleInputs.addInputPath(job2, new Path("inputplanes"), TextInputFormat.class, FlightsModeMapper.class);
But I can't come up with a way to combine the two and get the desired result. My current approach just gives me a list similar to this example:
A ABC
A 30
B ABCD
C ABCDEF
D EFGH
D 20
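One way the two kinds of records in that list could be told apart later is for each mapper to prefix its value with a source tag. Here is a minimal sketch in plain Java with no Hadoop dependency. The class `TaggedRecordSketch`, the method `tag`, and the tags `M` (name/model) and `N` (number) are hypothetical names chosen for illustration, and it assumes whitespace-separated lines with the ID first:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: shows what each mapper could emit.
class TaggedRecordSketch {

    // Splits "ID<whitespace>rest" and prefixes the rest with a source tag,
    // so a later step can tell which input file the value came from.
    static Map.Entry<String, String> tag(String line, String sourceTag) {
        String[] parts = line.trim().split("\\s+", 2);
        String value = parts.length > 1 ? parts[1] : "";
        return new SimpleEntry<>(parts[0], sourceTag + ":" + value);
    }

    public static void main(String[] args) {
        // Record from the first job's output (ID and number).
        System.out.println(tag("A 30", "N"));
        // Record from the second file (ID and name).
        System.out.println(tag("A ABC", "M"));
    }
}
```

In an actual mapper, the key and the tagged value would presumably be passed to context.write as Text objects instead of being returned.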
After my last reduce, I get the following:
N125DL 767-332 N125DL 7 , N126AT 737-76N N126AT 19 , N126DL 767-332 N126DL 1 , N127DL 767-332 N127DL 7 , N128DL 767-332 N128DL 3
I want this: N127DL 7 767-332. I also do not want the records that have no match.
And this is my reducer class:
public class FlightByCarrierReducer2 extends Reducer<Text, Text, Text, Text> {
    String merge = "";

    @Override
    protected void reduce(Text token, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        int i = 0;
        for (Text value : values) {
            if (i == 0) {
                // First value for this key starts the merged string.
                merge = value.toString() + ",";
            } else {
                merge += value.toString();
            }
            i++;
        }
        context.write(token, new Text(merge));
    }
}
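For comparison, here is a minimal sketch of the join logic I am after, written as plain Java so it runs without Hadoop. It assumes each value reaching the reducer carries a source prefix, `M:` for the model and `N:` for the number. Those prefixes, the class `ReduceSideJoinSketch`, and the method `join` are hypothetical and do not appear in my job. Because Hadoop does not guarantee the order of values within a key, the sketch does not rely on which value arrives first:

```java
import java.util.Arrays;

// Hypothetical helper, not part of Hadoop: the per-key logic of an inner join.
class ReduceSideJoinSketch {

    // Returns "number model" for one key, or null when either side is
    // missing, so unmatched keys can be skipped by the caller.
    static String join(Iterable<String> taggedValues) {
        String model = null;
        String number = null;
        for (String v : taggedValues) {
            if (v.startsWith("M:")) {
                model = v.substring(2);
            } else if (v.startsWith("N:")) {
                number = v.substring(2);
            }
        }
        if (model == null || number == null) {
            return null; // no match: emit nothing for this key
        }
        return number + " " + model;
    }

    public static void main(String[] args) {
        // Key N127DL with both sides present, in either order.
        System.out.println("N127DL " + join(Arrays.asList("M:767-332", "N:7")));
        // Key with only a model and no number: join returns null.
        System.out.println(join(Arrays.asList("M:737-76N")));
    }
}
```

Inside reduce, the same loop would presumably run over value.toString(), and context.write would be called only when the result is not null.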
Update:
This is the example data set I am using: http://stat-computing.org/dataexpo/2009/the-data.html
I am working with TailNum and Canceled (which is 1 or 0), and I want to get the model name corresponding to each TailNum. My model file has TailNum, Model, and other fields. My current output:
N193JB ERJ 190-100 IGW
N194DN 767-332
N19503 EMB-135ER
N19554 EMB-145LR
N195DN 767-332
N195DN 2
The key comes first, then the model. For keys that have canceled flights, the count appears on a separate line below the model.
I would like a trio of Key, Model, Number of Canceled instead, because I want the number of cancellations per model.
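To make the final goal concrete, here is a minimal plain-Java sketch of the per-model total, assuming the join has already produced trios of key, model, and number of canceled flights. The class `CancellationsPerModelSketch` and the method `sumByModel` are hypothetical names, and the sample trios reuse the tail numbers, models, and counts from my reduce output above:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Hadoop: sums canceled flights per model.
class CancellationsPerModelSketch {

    // Each record is {tailNum, model, canceledCount}.
    static Map<String, Integer> sumByModel(String[][] records) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String[] r : records) {
            // Add this tail number's count to the running total of its model.
            totals.merge(r[1], Integer.parseInt(r[2]), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String[][] records = {
            {"N125DL", "767-332", "7"},
            {"N126AT", "737-76N", "19"},
            {"N126DL", "767-332", "1"},
            {"N127DL", "767-332", "7"},
            {"N128DL", "767-332", "3"},
        };
        System.out.println(sumByModel(records));
    }
}
```

In MapReduce terms this would presumably be one more job that uses the model as the key and sums the counts in its reducer.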