Following your hint, I dug into Pig's source code to find the answer.
Setting pig.noSplitCombination in a Pig script does not work. In a Pig script you need to use pig.splitCombination instead. Pig then sets pig.noSplitCombination in the JobConf according to the value of pig.splitCombination .
If you want to set pig.noSplitCombination directly, you need to use the command line. For instance,
pig -Dpig.noSplitCombination=true -f foo.pig
The difference between the two methods is as follows: if you use the set statement in a Pig script, it is stored in the Pig properties. If you use -D , it is saved in the Hadoop configuration.
If you use set pig.noSplitCombination true , the pair (pig.noSplitCombination, true) is stored in the Pig properties. But when Pig initializes the JobConf , it looks up pig.splitCombination in the Pig properties, not pig.noSplitCombination , so your setting has no effect. Here is the source code. The correct way is to use set pig.splitCombination false , as you mentioned.
If you use -Dpig.noSplitCombination=true , (pig.noSplitCombination, true) is stored in the Hadoop configuration. Because JobConf is copied from Configuration , the -D value is passed directly to JobConf .
Finally, PigInputFormat reads pig.noSplitCombination from the JobConf to decide whether to combine splits. Here is the source code.
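To make the script-side option concrete, here is a minimal sketch of a Pig script that turns split combination off. The relation name, paths and loader are hypothetical placeholders for illustration. Only the set line reflects the behaviour described above.

```pig
-- Disable split combination from inside the script.
-- Pig reads pig.splitCombination from its own properties when it builds the JobConf.
set pig.splitCombination false;

-- Hypothetical input and output, for illustration only.
raw = LOAD 'input_dir' USING PigStorage('\t');
STORE raw INTO 'output_dir';

-- Writing "set pig.noSplitCombination true;" here instead would have no effect.
-- That key only works when passed on the command line with -D.
```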