Multiple Iteration Performance

I'm wondering about the performance effect of doing one iteration versus many iterations. I work in Python, and I'm not sure whether that affects the answer.

Consider applying a series of data transformations to each item in a list:

    def one_pass(my_list):
        for i in xrange(0, len(my_list)):
            my_list[i] = first_transformation(my_list[i])
            my_list[i] = second_transformation(my_list[i])
            my_list[i] = third_transformation(my_list[i])
        return my_list

    def multi_pass(my_list):
        range_end = len(my_list)
        for i in xrange(0, range_end):
            my_list[i] = first_transformation(my_list[i])
        for i in xrange(0, range_end):
            my_list[i] = second_transformation(my_list[i])
        for i in xrange(0, range_end):
            my_list[i] = third_transformation(my_list[i])
        return my_list

Now, setting readability aside, strictly in terms of performance, is there a real advantage to one_pass over multi_pass? Assuming that most of the work is done in the transformation functions themselves, won't each loop in multi_pass take about a third as long?

+4
4 answers

The difference is in how often the values, and the code being executed, are in the CPU cache.

If the elements of my_list are large but fit in the processor cache, the first version may benefit. If, on the other hand, the (byte)code of the transformation functions is large, caching the code may matter more than caching the data.

Both versions are probably slower than this more readable one:

    def simple(my_list):
        return [third_transformation(second_transformation(first_transformation(e)))
                for e in my_list]

Execution times on my machine:

    one_pass:   0.839533090591
    multi_pass: 0.840938806534
    simple:     0.569097995758
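For reference, timings like those could come from a small timeit harness along these lines (a sketch: the transformation functions here are trivial stand-ins, since the question doesn't show the real ones, and one_pass, multi_pass, and simple are assumed to be defined as above):

    import timeit

    # Hypothetical stand-ins for the real transformation functions.
    def first_transformation(x):
        return x + 1

    def second_transformation(x):
        return x * 2

    def third_transformation(x):
        return x - 3

    data = list(range(100000))

    for fn in (one_pass, multi_pass, simple):
        # Copy the list on every run so the in-place versions always
        # start from the same input.
        elapsed = timeit.timeit(lambda: fn(list(data)), number=100)
        print("%s: %.3f" % (fn.__name__, elapsed))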
+5

Assuming you are considering a program that could just as easily be a single loop performing multiple operations or multiple loops performing one operation each, the choice never changes the computational complexity (an O(n) algorithm is still O(n) either way).

One benefit of the one-pass approach is that you save on the loop "bookkeeping". Whether the iteration mechanism is incrementing and comparing counters, fetching "next" pointers and checking for null, or something else, you do less of it when you do everything in one pass. Assuming your operations do any significant amount of work (and that your looping mechanism is simple and direct, not something like an expensive generator), this bookkeeping will be dwarfed by the actual work of the operations, making it definitely a micro-optimization: one you shouldn't perform unless you know your program is too slow and you've exhausted all the more significant optimizations available.
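To see that bookkeeping concretely, you can disassemble an empty loop: everything emitted per iteration is pure overhead. A minimal sketch (Python 2, to match the question's xrange):

    import dis

    def loop_only(my_list):
        # Every per-iteration instruction dis prints for this loop
        # (FOR_ITER, STORE_FAST, the jump back to the top) is loop
        # bookkeeping rather than useful work.
        for i in xrange(len(my_list)):
            pass

    dis.dis(loop_only)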

Another advantage can be that applying all of your operations to each element before moving on to the next tends to benefit more from the CPU cache, since each element may still be in cache during subsequent operations on the same element, whereas multiple passes make that nearly impossible (unless your whole collection fits in the cache). Python does so much dereferencing through dictionaries, though, that each operation may well overflow the cache anyway by reading hash buckets scattered all over the memory space. So this is still a micro-optimization, but the analysis gives it a better chance (though no certainty) of making a significant difference.

One advantage of the multi-pass approach is that if you need to maintain state between loop iterations, the single-pass approach forces you to keep the state of all the operations around at once. This can hurt the CPU cache (perhaps the state of each operation individually fits in the cache for an entire pass, but not the state of all the operations taken together). In the extreme case this effect can theoretically make the difference between the program fitting in memory and not (I ran into this once in a program that was chewing through very large quantities of data). But in the extreme cases you know you need to split things up, and the non-extreme cases are, again, micro-optimizations that shouldn't be made up front.
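As a hypothetical sketch of that state argument: if each pass keeps its own sizeable working state (here, a memoization dictionary), a multi-pass design holds only one such state alive at a time, while a one-pass design would need all three at once:

    def pass_with_state(values, transform):
        state = {}  # per-pass working state, freed when the pass returns
        out = []
        for v in values:
            if v not in state:
                state[v] = transform(v)
            out.append(state[v])
        return out

    def multi_pass_stateful(values):
        # Each call builds and then discards one state dict in turn.
        values = pass_with_state(values, first_transformation)
        values = pass_with_state(values, second_transformation)
        values = pass_with_state(values, third_transformation)
        return values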

So performance usually favors single-pass by a small amount, but it can in some cases favor either approach by a significant amount. The conclusion to draw from this is the same as the general advice that applies to all programming: start by writing the code in whichever way is clearest and most maintainable while still addressing your problem adequately. Only once you have a mostly finished program, and only if it turns out to be "not fast enough", should you measure the performance impact of the various parts of your code to find out where your time is best spent.
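In Python, that measuring step might look like this (a minimal sketch using the standard-library profiler; one_pass and the transformation functions are assumed to be defined as above):

    import cProfile
    import pstats

    data = list(range(100000))
    cProfile.run("one_pass(data)", "one_pass.prof")

    # Show the ten most expensive calls by cumulative time.
    pstats.Stats("one_pass.prof").sort_stats("cumulative").print_stats(10)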

Time spent choosing between single-pass and multi-pass algorithms up front for performance reasons will almost always be wasted. So, unless you have unlimited development time, you will get the "best" results from your total development effort ("best" including performance) by not worrying about it in advance and addressing it as the need arises.

+2

Either version can end up with fewer cache misses than the other. It depends on what the transformation functions actually do.

If those functions have a lot of code and work over different sets of data (besides the input and output), multi-pass may be better. Otherwise, single-pass is likely to be better, because the current list element will probably stay cached, and the loop bookkeeping is done once instead of three times.

This is a case where comparing actual run times would be very useful.

+1

Personally, I would prefer the one_pass option without a doubt. It definitely performs better. You may be right that the difference won't be huge; Python's xrange iterator is very well optimized, but you are still doing three times as many iterations as necessary.
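Incidentally, the in-place one-pass loop can also be written without the manual index bookkeeping, while still making exactly one pass (a sketch):

    def one_pass(my_list):
        # enumerate yields index and value together, so there is no
        # separate counter to maintain, and the list is still updated
        # in place.
        for i, value in enumerate(my_list):
            my_list[i] = third_transformation(
                second_transformation(first_transformation(value)))
        return my_list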

+1

Source: https://habr.com/ru/post/1446342/

