sets is assigned a lambda that is not really meant to accept any input, as you can see from the way it is called. In general, lambdas behave like normal functions and can therefore be assigned to variables such as g or sets. The definition of sets is surrounded by an extra pair of parentheses for no apparent reason. You can ignore those outer parentheses.
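As a minimal sketch of those two points, a lambda bound to a name is called like any other function, and wrapping its definition in parentheses changes nothing. The names g, h and f and the body x + 1 are made up for illustration and are not taken from your code:

```python
# Hypothetical names and bodies, for illustration only.
g = lambda x: x + 1        # a lambda bound to a name
h = (lambda x: x + 1)      # the same thing; the outer parentheses are redundant

def f(x):                  # an equivalent function written with def
    return x + 1

print(g(1), h(1), f(1))    # all three calls return the same value
```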
Lambdas can have all the same kinds of positional, keyword, and default arguments that normal functions can. The lambda sets has a default parameter named split. This is a common idiom: it ensures that sets in each iteration of the loop gets the value of split corresponding to that iteration, rather than the value from the last iteration in all cases.
Without the default parameter, split would be looked up inside the lambda at the time it is called, using the enclosing namespace as it is at that moment. Once the loop completes, split in the outer function's namespace is simply the last value it had in the loop.
Default parameters, by contrast, are evaluated immediately when the function object is created. This means the default value of split is whatever split holds in the iteration of the loop that creates the lambda.
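The difference can be seen even without a loop. The following is only a sketch with made-up names (late and early are not from your code): one lambda looks up split when it is called, the other stores the current value as a default when it is created.

```python
# Hypothetical names, for illustration only.
split = 'train'

late = lambda: split                # split is looked up when late() is called
early = lambda split=split: split   # the default is evaluated right now

split = 'test'                      # rebind the outer name afterwards

print(late())                       # uses the current value of the outer split
print(early())                      # uses the default stored at creation time
print(early.__defaults__)           # the stored default lives on the function object
```

After the rebinding, late() returns 'test' while early() still returns 'train', and early.__defaults__ is ('train',).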
Your example is a bit misleading because it discards every value of sets except the last, which makes the default parameter of the lambda pointless. Here is an example illustrating what happens if you keep all the lambdas. First, with the default parameter:
    sets = []
    for split in ['train', 'test']:
        sets.append(lambda split=split: split)
    print([fn() for fn in sets])
I shortened the lambdas so that they just return their input parameter, for illustration purposes. This example prints ['train', 'test'], as expected.
If you do the same without the default parameter, the output is ['test', 'test'] instead:
    sets = []
    for split in ['train', 'test']:
        sets.append(lambda: split)
    print([fn() for fn in sets])
This is because 'test' is the value of split at the time all the lambdas get evaluated.