Are higher-order functions for collections guaranteed to be executed sequentially?

In another question, a user suggested writing code similar to this:

def list = ['a', 'b', 'c', 'd']
def i = 0
assert list.collect { i++ } == [0, 1, 2, 3]

In other languages, such code is considered bad practice because the closure passed to the collection method changes the state of its enclosing context (here it changes the value of i). In other words, the closure has side effects.

Higher-order functions like this should be free to run the closure in parallel and reassemble the results into a new list. If the processing in the closure is long-running and CPU-intensive, it may be worthwhile to execute it in separate threads. It would be easy to modify collect to use an ExecutorCompletionService to achieve this, but that would break the code above.
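To make the concern concrete, here is a minimal sketch of a parallel collect. The helper name parallelCollect and the pool size of 4 are assumptions for illustration, and it uses ExecutorService.invokeAll rather than the ExecutorCompletionService mentioned above, because invokeAll returns its futures in task order, which keeps reassembly simple.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical helper (not part of Groovy): applies the closure to every
// element on a thread pool and returns the results in input order.
def parallelCollect(List items, Closure transform) {
    def pool = Executors.newFixedThreadPool(4)   // pool size is an arbitrary choice
    try {
        // One task per element; each task only depends on its own element.
        List<Callable> tasks = items.collect { item -> { -> transform(item) } as Callable }
        // invokeAll blocks until all tasks finish; futures come back in task order.
        pool.invokeAll(tasks)*.get()
    } finally {
        pool.shutdown()
    }
}

def list = ['a', 'b', 'c', 'd']
// A closure without side effects gives the same answer as a sequential collect.
assert parallelCollect(list, { it.toUpperCase() }) == ['A', 'B', 'C', 'D']
```

With the counter closure from the question, the value paired with each element would depend on thread scheduling, and i++ on a shared variable is not atomic, so the result would no longer be predictable.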

Another way this could go wrong: suppose that, for some reason, collect walked the collection in reverse order. The result would then be [3, 2, 1, 0]. Note that in this case the list has not been reversed: 0 really is the result of applying the closure to 'd'!

Interestingly, these functions are documented with "Iterates through this collection" in the Collection JavaDoc, which suggests that the iteration is sequential.

Does the Groovy specification explicitly define the order of execution in higher-order functions such as collect or each? Is the above code broken, or is it fine?

+4
2 answers

I don't like my closures relying on explicit external variables, for the reasons you give above.

In fact, the fewer variables I have to define, the happier I am :-)

For things that might become parallel, always code with a view to wrapping them in some level of GPars goodness if they prove too much for a single thread to handle. For that, as you say, you want as little mutability as possible, and you should try to avoid side effects completely (such as the external counter pattern above).

As for the question itself, if we take collect as an example and look at the source code, we can see that, given an Object (Collection and Map are handled similarly, with slight differences in how the Iterator is obtained), it iterates along InvokerHelper.asIterator(self), adding the result of each closure call to the resulting list.

InvokerHelper.asIterator (again, the source is here) basically calls the iterator() method on the object passed in.

So for Lists and the like, it iterates over the elements in the order defined by the iterator.

It is therefore possible to write your own class that follows the Iterable interface design (it doesn't need to implement Iterable, though, thanks to duck typing) and define how the collection will be iterated.
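As a small illustration of that point, the sketch below runs the question's counter pattern over a List and over a TreeSet built from the same elements. The variable names are made up for the example; the only assumption is that collect follows whatever order the collection's iterator yields.

```groovy
def i = 0
// A List iterates in insertion order, so the counter follows that order.
def fromList = ['c', 'a', 'b'].collect { [it, i++] }
assert fromList == [['c', 0], ['a', 1], ['b', 2]]

i = 0
// A TreeSet iterates in sorted order, so the same closure pairs differently.
def fromSortedSet = new TreeSet(['c', 'a', 'b']).collect { [it, i++] }
assert fromSortedSet == [['a', 0], ['b', 1], ['c', 2]]
```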
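A rough sketch of such a class is below. Backwards is a hypothetical name; it does not implement Iterable and only exposes an iterator() method, which is assumed to be picked up by the Object variant of collect described above.

```groovy
// Hypothetical class: not an Iterable, it just has an iterator() method
// that walks its items from last to first.
class Backwards {
    List items
    Iterator iterator() { items.reverse().iterator() }
}

def i = 0
def result = new Backwards(items: ['a', 'b', 'c', 'd']).collect { [it, i++] }
// The counter is paired with the elements in the order the iterator defines.
assert result == [['d', 0], ['c', 1], ['b', 2], ['a', 3]]
```

This reproduces the situation from the question: 0 ends up being the result for 'd' because the iteration order changed, not because the output list was reversed.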

Since you ask about the Groovy specification, this answer may not be what you are after, but I don't think there is an answer there. Groovy has never really had a "complete" specification (indeed, this is a point about Groovy that some people dislike).

+3

I think that keeping the functions passed to collect or findAll free of side effects is a good idea in general, not only to keep complexity low, but also to make the code more parallel-friendly in case parallel execution is needed in the future.

But in the case of each there is not much point in keeping the function free of side effects, since it would then do nothing (in fact, the sole purpose of this method is to act as a replacement for a for-each loop). The Groovy documentation has some examples of using each (and its variants eachWithIndex and reverseEach) that require the execution order to be defined.
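For illustration, here is a short sketch (variable names are made up) of each-style methods whose entire effect is a side effect, and whose result only makes sense if the order of execution is defined:

```groovy
def list = ['a', 'b', 'c']

// eachWithIndex passes the element and its index; appending to an outer
// list is a side effect, and the outcome relies on sequential iteration.
def enumerated = []
list.eachWithIndex { val, idx -> enumerated << [idx, val] }
assert enumerated == [[0, 'a'], [1, 'b'], [2, 'c']]

// reverseEach visits the elements from last to first.
def reversed = []
list.reverseEach { reversed << it }
assert reversed == ['c', 'b', 'a']
```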

Now, from a pragmatic point of view, I think it is sometimes useful to pass functions with some side effects to methods like collect. For example, to transform a list into [index, value] pairs, you can use transpose and a range:

 def list = ['a', 'b', 'c']
 def enumerated = [0..<list.size(), list].transpose()
 assert enumerated == [[0,'a'], [1,'b'], [2,'c']]

Or even inject:

 def enumerated = list.inject([]) { acc, val -> acc << [acc.size(), val] } 

But collect with a counter also does the trick, and I think the result is the most readable:

 def n = 0, enumerated = list.collect{ [n++, it] } 

Now, this example would be moot if Groovy provided collect and similar methods with a variant that takes an index-and-value function (see the Jira issue), but it does show that sometimes practicality beats purity, IMO :)
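As a side note, and assuming Groovy 2.4 or later (which is newer than this answer), the indexed() method covers this particular case without an external counter. A minimal sketch:

```groovy
def list = ['a', 'b', 'c']

// indexed() (assumed available, Groovy 2.4+) returns a Map of index -> value;
// collect on a Map with a two-parameter closure receives key and value.
def enumerated = list.indexed().collect { idx, val -> [idx, val] }
assert enumerated == [[0, 'a'], [1, 'b'], [2, 'c']]
```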

+1

Source: https://habr.com/ru/post/1380747/

