C ++ Iterators and Loop Optimization

I see a lot of C ++ code that looks like this:

for( const_iterator it = list.begin(), const_iterator ite = list.end(); it != ite; ++it) 

Unlike the more concise version:

 for( const_iterator it = list.begin(); it != list.end(); ++it) 

Will there be a speed difference between the two agreements? Naively, the first one will be a little faster, since list.end () is called only once. But since the iterator is const, it looks like the compiler will pull this test out of the loop, creating an equivalent assembly for both.

+46
c ++ optimization compiler-construction iterator coding-style
Apr 28 '09 at 2:16
source share
11 answers

I just mentioned for the record that the C ++ standard provides a call to begin() and end() for any type of container (be it vector , list , map , etc.) should only take constant time . In practice, these calls will almost certainly be included in one pointer comparison if you compile with optimizations enabled.

Please note that this warranty is not necessarily valid for additional β€œcontainers” supplied by the supplier, which in fact do not comply with the formal requirements for the container described in chapter 23 of the standard (for example, a slist list).

+29
Apr 28 '09 at 11:24
source share

However, the two versions do not match. In the second version, it compares the iterator with list.end() each time, and what list.end() can evaluate can change during the cycle. Now, of course, you cannot change list through const_iterator it ; but nothing prevents code inside the loop from directly accessing list methods and mutating it, which can (depending on the list data structure) change the final iterator. Therefore, in some cases, it may be incorrect to store the final iterator in advance, because by the time you get to it, it will no longer be the right final iterator.

+43
Apr 28 '09 at 2:50
source share

The first one will probably almost always be faster, but if you think it will matter, always profile first to find out which is faster and how much.

The compiler will probably be able to insert an end() call in both cases, although if end() is complex enough, it may not include it. However, a key optimization is whether the compiler can execute loop movement with invariant code . I would say that in most cases the compiler cannot be sure that the end() value will not change during the iteration of the loop, in which case it has no choice but to call end() after each iteration.

+11
Apr 28 '09 at 2:22
source share

I would choose the option that is the most concise and readable. Do not try to guess about the compiler and its optimizations. Remember that the vast majority of your code will not affect the overall performance at all, therefore, only if it is in the section of code that is critical for performance, you should spend time profiling it and choose the appropriate effective representation of the source.

With a specific reference to your example, the first version makes a copy of the end() tetra, invoking any code executed for the copy constructor of the iterator object. STL containers usually contain built-in end() functions, so the compiler has many options for optimizing the second version, even if you are not trying to help it. Which one is better? Measure them.

+8
Apr 28 '09 at 2:32
source share

You can make the first version shorter and get the best of both:

 for( const_iterator it = list.begin(), ite = list.end(); it != ite; ++it) 

PS Iterators are not const; they are iterators for a const reference. There is a big difference.

+6
Apr 28 '09 at 2:46 april
source share

Consider the following example:

 for (const_iterator it = list.begin(); it != list.end(); ++list) { if (moonFull()) it = insert_stuff(list); else it = erase_stuff(list); } 

in this case you need to call list.end () in a loop, and the compiler is not going to optimize this.

Other cases where the compiler can prove that end () always returns the same value, optimization can be performed.

If we are talking about STL containers, then I think that any good compiler can optimize multiple calls () if programming logic does not require multiple calls (). However, if you have a custom container, and the end () implementation is not included in the same translation unit, then optimization should happen during the connection. I know very little about link time optimization, but I would bet that most linkers would not do this optimization.

+6
Apr 28 '09 at 4:43
source share

And, people seem to make guesses. Open your code in the debugger and you will see that calls to begin (), end (), etc. Everything is optimized. No need to use version 1. Tested with Visual C ++ fullopt compiler.

+4
Apr 28 '09 at 3:42
source share

The compiler can optimize the second in the first, but it is assumed that these two are equivalent, i.e. end () is actually persistent. A slightly more problematic problem is that the compiler may not infer that the final iterator is constant due to possible overlap. However, assuming the end () call is inline, the difference is just loading memory.

Please note that this assumes that the optimizer is enabled. If the optimizer is not enabled, as is often done in debug builds, then the second formulation will include N-1 more function calls. In current versions of Visual C ++, debug builds will also be subject to extra hits due to checks on proog / epilog functions and heavier debug iterators. Therefore, in heavy STL code, by default in the first case, code execution in debug builds may be disproportionately slow.

Inserting and deleting inside a loop is an opportunity, as others have pointed out, but with this loop style, I find this unlikely. First, node containers β€” list, set, map β€” do not invalidate end () for any operation. Secondly, iterator increment often needs to be moved around in a loop to avoid invalidation problems:

    // assuming list - cannot cache end () for vector
    iterator it (c.begin ()), end (c.end ());
    while (it! = end) {
        if (should_remove (* it))
            it = c.erase (it);
        else
            ++ it;
    } 

Thus, I am considering a loop that claims to call end () for mutate-in-loop reasons and still has ++ in the loop header to be suspicious.

+4
Apr 28 '09 at 10:30
source share

I always preferred the first one. Although with built-in functions, compiler optimization and a relatively smaller container size (in my case, this is usually not more than 20-25 elements), it really does not really matter in terms of performance.

 const_iterator it = list.begin(); const_iterator endIt = list.end(); for(; it != endIt ; ++it) {//do something } 

But lately, I have been using more std :: for_each wherever possible. Its an optimized loop that helps make the code more readable than the other two.

 std::for_each(list.begin(), list.end(), Functor()); 

I will use the loop only when std::for_each cannot be used. (for example: std::for_each does not allow you to break the loop unless an exception is thrown).

+1
Apr 28 '09 at 2:58
source share
  • Try it under stress and see if you often use this code ***.
    If not, it does not matter.

  • If you are, look at the disassembly or one step.
    How can you determine which is faster.

You have to be careful with these iterators.
They can be optimized into good machine code, but quite often they do not and become hours.

** (Where "in" means actually in it or is called from it.)

*** (where β€œoften” means a significant percentage of the time.)

ADDED: don't just look how many times per second the code is executed. It can be 1000 times per second and still use less than 1% of the time.

Do not waste time, how much time is required. This can take a millisecond and use less than 1% of the time.

You can multiply two to get a better idea, but this only works if they are not too skewed.

Sampling the call stack will tell you if it uses a high enough percentage of the time.

+1
Apr 28 '09 at 11:41
source share

Theoretically, the compiler can optimize the second version in the first (assuming the container does not change during the loop, obviously).

In practice, I found several similar cases when profiling critical code, when my compiler was completely unable to derive invariant calculations from the loop conditions. Therefore, although in most cases a slightly more concise version is fine, I do not rely on the compiler doing reasonable things with it for the case when I am really concerned about performance.

0
Apr 28 '09 at 2:45
source share



All Articles