The cost of a virtual function in a tight loop

Question

I have game objects with a virtual function Update(). There are many game objects (currently a little over 7000), and the main loop calls Update() on all of them (among other things). My colleague suggested that we remove the virtual function altogether. As you can imagine, that would require a lot of refactoring.

I saw this answer, but in my case profiling would mean changing a lot of code. So before I even start, I thought I would ask here for opinions on whether the refactoring is worth it in this case.

Note that I have profiled the other parts of the loop and optimized the parts that take the longest. I suspect the virtual function call is something I should not worry about here, but I can't be sure until I profile, and I can't profile until I change the code (which is a lot). Also note that some of the Update() functions are very small, while others are more complex.

EDIT: Several answers give a lot of useful information, so anyone who stumbles on this question in the future should look at all the answers, not just the accepted one.

+5

c++ optimization virtual-functions

Samaursa Jul 6 '11 at 15:37

4 answers

A virtual function call adds little more than one indirection and one hard-to-predict jump. That means you typically lose one pipeline flush, or about 20 cycles, per virtual call. 7000 of them come to about 140,000 cycles, which should be negligible compared to your average update function. If that is not the case, say because most of your update functions are simply empty, you can consider putting the objects that actually need updating in a separate list for this purpose.

Removing the virtual function will just lead one of you to replace it with an identical but home-grown system. This is exactly the kind of place where a virtual function makes sense.

For comparison, 140,000 cycles is about 50 microseconds. That assumes a P4 with its huge pipeline and a full pipeline flush on every call (which you usually don't get).
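
The separate-list idea from the first paragraph of this answer could look roughly like the following. This is only a minimal sketch, assuming a hypothetical GameObject base class with a NeedsUpdate() hook; that hook and the World, Mover, and Scenery names are made up for illustration and do not come from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical base class; the question only says objects have a virtual Update().
class GameObject {
public:
    virtual ~GameObject() = default;
    virtual void Update() {}                            // default: nothing to do
    virtual bool NeedsUpdate() const { return false; }  // assumed hook, not in the question
};

// Hypothetical object with real per-frame work.
class Mover : public GameObject {
public:
    void Update() override { ++position; }
    bool NeedsUpdate() const override { return true; }
    int position = 0;
};

// Hypothetical static object whose Update() would be empty.
class Scenery : public GameObject {};

class World {
public:
    // Takes ownership; objects that need updating are also recorded in a second list.
    GameObject* Add(std::unique_ptr<GameObject> obj) {
        assert(obj != nullptr);
        GameObject* raw = obj.get();
        if (raw->NeedsUpdate()) updatable_.push_back(raw);
        objects_.push_back(std::move(obj));
        return raw;
    }

    // Only objects that asked for updates pay for the virtual call.
    void UpdateAll() {
        for (GameObject* obj : updatable_) obj->Update();
    }

    std::size_t UpdatableCount() const { return updatable_.size(); }
    std::size_t TotalCount() const { return objects_.size(); }

private:
    std::vector<std::unique_ptr<GameObject>> objects_;  // owns every object
    std::vector<GameObject*> updatable_;                // non-owning subset
};
```

With one Mover and two Scenery objects added, UpdateAll() makes a single virtual call per frame instead of three. The sketch never removes objects; a real version would have to keep the two lists in sync when objects are destroyed.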

+10

dascandy Jul 6 '11 at 15:42

Although this is not the same code, and may not be the same compiler you are using, here is some reference data from a rather old benchmark (bench++ by Joe Orost):

    Test Name:   F000005                         Class Name:  Style
    CPU Time:       7.70  nanoseconds            plus or minus      0.385
    Wall/CPU:       1.00  ratio.                 Iteration Count:  1677721600
    Test Description:
     Time to test a global using a 10-way if/else if statement
     compare this test with F000006

    Test Name:   F000006                         Class Name:  Style
    CPU Time:       2.00  nanoseconds            plus or minus     0.0999
    Wall/CPU:       1.00  ratio.                 Iteration Count:  1677721600
    Test Description:
     Time to test a global using a 10-way switch statement
     compare this test with F000005

    Test Name:   F000007                         Class Name:  Style
    CPU Time:       3.41  nanoseconds            plus or minus      0.171
    Wall/CPU:       1.00  ratio.                 Iteration Count:  1677721600
    Test Description:
     Time to test a global using a 10-way sparse switch statement
     compare this test with F000005 and F000006

    Test Name:   F000008                         Class Name:  Style
    CPU Time:       2.20  nanoseconds            plus or minus      0.110
    Wall/CPU:       1.00  ratio.                 Iteration Count:  1677721600
    Test Description:
     Time to test a global using a 10-way virtual function class
     compare this test with F000006

This specific result comes from compiling with the 64-bit version of VC++ 9.0 (VS 2008), but it is quite similar to what I have seen from other recent compilers. The bottom line is that the virtual function is faster than most of the obvious alternatives, and very close to the speed of the only one that beats it (in fact, the two are equal within the measured margin of error). That, however, depends on the values being dense: as you can see in F000007, if the values are sparse, the switch statement produces code that is slower than the virtual function call.

Bottom line: the virtual function call is probably the wrong place to look. Refactored code could easily run slower, and even in the best case it probably won't gain enough to notice or care about.
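
To repeat this kind of comparison in your own build rather than rely on the numbers above, something along these lines may be a starting point. It is a rough sketch, not the bench++ code: Kind, Stepper, the Plus* classes, and time_ns are hypothetical names, and a serious measurement would need many iterations and care that the optimizer does not remove the loops.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <vector>

// Variant 1: dense switch on a type tag.
enum class Kind { One, Two, Three };

int step_switch(Kind kind, int value) {
    switch (kind) {
        case Kind::One:   return value + 1;
        case Kind::Two:   return value + 2;
        case Kind::Three: return value + 3;
    }
    return value;  // unreachable for valid tags
}

// Variant 2: the same three behaviours behind a virtual function.
struct Stepper {
    virtual ~Stepper() = default;
    virtual int step(int value) const = 0;
};
struct PlusOne   : Stepper { int step(int v) const override { return v + 1; } };
struct PlusTwo   : Stepper { int step(int v) const override { return v + 2; } };
struct PlusThree : Stepper { int step(int v) const override { return v + 3; } };

// Builds the virtual counterpart of a tag so both variants see the same workload.
std::unique_ptr<Stepper> make_stepper(Kind kind) {
    switch (kind) {
        case Kind::One:   return std::make_unique<PlusOne>();
        case Kind::Two:   return std::make_unique<PlusTwo>();
        case Kind::Three: return std::make_unique<PlusThree>();
    }
    return nullptr;
}

int run_switch(const std::vector<Kind>& kinds) {
    int total = 0;
    for (Kind k : kinds) total = step_switch(k, total);
    return total;
}

int run_virtual(const std::vector<std::unique_ptr<Stepper>>& steppers) {
    int total = 0;
    for (const auto& s : steppers) total = s->step(total);
    return total;
}

// Sanity check: both dispatch styles must agree before their timings mean anything.
void check_equivalent(const std::vector<Kind>& kinds,
                      const std::vector<std::unique_ptr<Stepper>>& steppers) {
    assert(run_switch(kinds) == run_virtual(steppers));
}

// Runs fn once and returns the elapsed time in nanoseconds.
template <typename Fn>
long long time_ns(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling time_ns around run_switch and run_virtual on the same large input gives two numbers to compare. The results will vary with compiler, flags, and CPU, so treat them as a hint rather than a verdict.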

+8

Jerry Coffin Jul 6 '11 at 16:38

Another test comparing virtual, inline, and direct calls can be found here: Virtual functions and performance - C++

+2

Fedor Skrynnikov Jul 6 '11 at 15:46

Aaron digulla · Accepted Answer · 2011-07-06T15:43:45+0000

If you can't profile, look at the assembly code to see how expensive the lookup is. It may be a simple indirect jump, which costs almost nothing.

If you do need to refactor, here is a suggestion: create a set of "UpdateXxx" classes that know how to call the new non-virtual update() method. Collect them in an array, and then call update() on them.

But my guess is that you won't save much, especially not with only 7K objects.

Profiling note: if you can't use a profiler (which makes me wonder why not), time the calls to update() and log the calls that take longer than, say, 100 ms. Timing is not expensive and lets you quickly determine which calls are the most expensive.
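
One possible reading of the "UpdateXxx" suggestion is one updater per concrete type, each holding its objects in an array and calling a non-virtual update() directly. The sketch below follows that reading; Particle, Enemy, Updater, and World are hypothetical names, and the answer does not specify this exact layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical concrete types with a NON-virtual update().
struct Particle {
    float age = 0.0f;
    void update() { age += 1.0f; }
};

struct Enemy {
    int ticks = 0;
    void update() { ++ticks; }
};

// One "UpdateXxx" per concrete type: stores objects by value and calls
// update() directly, so the compiler can resolve (and often inline) the call.
template <typename T>
class Updater {
public:
    void add(const T& obj) { items_.push_back(obj); }

    void update_all() {
        for (T& item : items_) item.update();
    }

    std::size_t size() const { return items_.size(); }

    const T& at(std::size_t index) const {
        assert(index < items_.size());
        return items_[index];
    }

private:
    std::vector<T> items_;  // contiguous storage, one array per type
};

// The frame loop calls each typed updater in turn; no virtual dispatch involved.
struct World {
    Updater<Particle> particles;
    Updater<Enemy> enemies;

    void update() {
        particles.update_all();
        enemies.update_all();
    }
};
```

The trade-off is that every new object type needs its own member in World and its own line in update(), which is the bookkeeping a virtual function would otherwise do for you.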
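
The manual timing described in the profiling note can be done with std::chrono. A minimal sketch, assuming a hypothetical timed_update helper and SlowCall record; the threshold is a parameter, so the 100 ms figure from the note is just one possible value.

```cpp
#include <cassert>
#include <chrono>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical record of one call that exceeded the threshold.
struct SlowCall {
    std::string name;
    long long micros;
};

// Runs one update call, measures it with a monotonic clock, and logs it
// (to std::clog and to the slow list) only if it took longer than threshold.
template <typename Fn>
void timed_update(const std::string& name, Fn&& update,
                  std::chrono::microseconds threshold,
                  std::vector<SlowCall>& slow) {
    assert(!name.empty());  // every timed call needs a label for the log
    const auto start = std::chrono::steady_clock::now();
    update();
    const auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now() - start);
    if (elapsed > threshold) {
        slow.push_back(SlowCall{name, static_cast<long long>(elapsed.count())});
        std::clog << name << " took " << elapsed.count() << " us\n";
    }
}
```

In the game loop this would wrap each call, for example timed_update("enemy", [&] { enemy.Update(); }, std::chrono::milliseconds(100), slow), where enemy stands for whatever object is being updated.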