Is a virtual method faster than a non-virtual method?

I recently read "Measure Early and Often for Performance, part 2", which comes with source code and a binary.

Extract from the article: "I emphasized that in order to reliably create high-performance programs, you need to understand the performance of the individual components that you use at the beginning of the design process."

So I used the tool (v0.2.2) for comparison and tried to see the performance of the individual components.

On my machine (x64), the results are as follows:

Name                                                                    Median  Mean    StdDev  Min     Max     Samples
NOTHING [count=1000]                                                    0.14    0.177   0.164   0       0.651   10
MethodCalls: EmptyStaticFunction() [count=1000 scale=10.0]              1       1.005   0.017   0.991   1.042   10
Loop 1K times [count=1000]                                              85.116  85.312  0.392   84.93   86.279  10
MethodCalls: EmptyStaticFunction(arg1,...arg5) [count=1000 scale=10.0]  1.163   1.172   0.015   1.163   1.214   10
MethodCalls: aClass.EmptyInstanceFunction() [count=1000 scale=10.0]     1.009   1.011   0.019   0.995   1.047   10
MethodCalls: aClass.Interface() [count=1000 scale=10.0]                 1.112   1.121   0.038   1.098   1.233   10
MethodCalls: aSealedClass.Interface() (inlined) [count=1000 scale=10.0] 0       0.008   0.025   0       0.084   10
MethodCalls: aStructWithInterface.Interface() (inlined) [count=1000 scale=10.0] 0  0.008  0.025  0      0.084   10
MethodCalls: aClass.VirtualMethod() [count=1000 scale=10.0]             0.674   0.683   0.025   0.674   0.758   10
MethodCalls: Class.ReturnsValueType() [count=1000 scale=10.0]           2.165   2.16    0.033   2.107   2.209   10

I am surprised to see that the virtual method (0.674) is faster than the non-virtual instance method (1.009) or the static method (1.0). And the interface calls are not that slow! (I would have expected an interface call to be at least twice as slow.)

Since these results come from a reputable source, I wonder how to explain the findings above.

I don't think the article is out of date; the problem is that the article itself does not present any evidence. All the author did was provide a testing tool.

+4

2 answers

I would suggest that the benchmarking methodology used in his example is flawed. The following code, run in LINQPad, shows what you would expect:

    /* This is a benchmarking template I use in LINQPad when I want to do a
     * quick performance test. Just give it a couple of actions to test and
     * it will give you a pretty good idea of how long they take compared
     * to one another. It's not perfect: you can expect a 3% error margin
     * under ideal circumstances. But if you're not going to improve
     * performance by more than 3%, you probably don't care anyway. */
    void Main()
    {
        // Enter setup code here
        var foo = new Foo();
        var actions = new[]
        {
            new TimedAction("control", () =>
            {
                // do nothing
            }),
            new TimedAction("non-virtual instance", () =>
            {
                foo.DoSomething();
            }),
            new TimedAction("virtual instance", () =>
            {
                foo.DoSomethingVirtual();
            }),
            new TimedAction("static", () =>
            {
                Foo.DoSomethingStatic();
            }),
        };

        const int TimesToRun = 10000000; // Tweak this as necessary
        TimeActions(TimesToRun, actions);
    }

    public class Foo
    {
        public void DoSomething() {}
        public virtual void DoSomethingVirtual() {}
        public static void DoSomethingStatic() {}
    }

    #region timer helper methods

    // Define other methods and classes here
    public void TimeActions(int iterations, params TimedAction[] actions)
    {
        Stopwatch s = new Stopwatch();
        int length = actions.Length;
        var results = new ActionResult[actions.Length];

        // Perform the actions in their initial order.
        for (int i = 0; i < length; i++)
        {
            var action = actions[i];
            var result = results[i] = new ActionResult { Message = action.Message };
            // Do a dry run to get things ramped up/cached
            result.DryRun1 = s.Time(action.Action, 10);
            result.FullRun1 = s.Time(action.Action, iterations);
        }

        // Perform the actions in reverse order.
        for (int i = length - 1; i >= 0; i--)
        {
            var action = actions[i];
            var result = results[i];
            // Do a dry run to get things ramped up/cached
            result.DryRun2 = s.Time(action.Action, 10);
            result.FullRun2 = s.Time(action.Action, iterations);
        }

        results.Dump();
    }

    public class ActionResult
    {
        public string Message { get; set; }
        public double DryRun1 { get; set; }
        public double DryRun2 { get; set; }
        public double FullRun1 { get; set; }
        public double FullRun2 { get; set; }
    }

    public class TimedAction
    {
        public TimedAction(string message, Action action)
        {
            Message = message;
            Action = action;
        }
        public string Message { get; private set; }
        public Action Action { get; private set; }
    }

    public static class StopwatchExtensions
    {
        public static double Time(this Stopwatch sw, Action action, int iterations)
        {
            sw.Restart();
            for (int i = 0; i < iterations; i++)
            {
                action();
            }
            sw.Stop();
            return sw.Elapsed.TotalMilliseconds;
        }
    }

    #endregion

Results:

                          DryRun1   DryRun2   FullRun1   FullRun2
    control               0.0361    0         47.82      47.1971
    non-virtual instance  0.0858    0.0004    69.6178    68.7508
    virtual instance      0.1676    0.0004    70.5103    69.2135
    static                0.1138    0         66.6182    67.0308

Conclusion

These results show that invoking a virtual instance method takes slightly longer (perhaps 2-3%, after accounting for the control) than invoking a non-virtual instance method, which in turn takes slightly longer than a static call. This is what I would expect.

Update

I played around a little after @colinfang commented about adding the [MethodImpl(MethodImplOptions.NoInlining)] attribute to my methods, and I can conclude that micro-optimization is complicated. Here are a few observations:

  • As @colinfang pointed out, adding NoInlining to the methods yields results more in line with what he described. Unsurprisingly, inlining is one of the ways the runtime can make non-virtual methods faster than virtual ones. What is surprising is that suppressing inlining actually made the virtual methods take longer than the non-virtual ones.
  • If I compile with /optimize+ , invoking the virtual instance method actually takes less time than the control, by more than 20%.
  • If I remove the lambdas and pass the method groups directly, like this:

     new TimedAction("non-virtual instance", foo.DoSomething),
     new TimedAction("virtual instance", foo.DoSomethingVirtual),
     new TimedAction("static", Foo.DoSomethingStatic),

    ... then the virtual and non-virtual calls end up taking about the same amount of time as one another, but the static method call takes considerably longer (more than 20%).
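For reference, the NoInlining experiment described in the first bullet amounts to decorating the benchmark methods like this (a sketch using the Foo class and method names from the earlier listing):

```csharp
using System.Runtime.CompilerServices;

public class Foo
{
    // Prevent the JIT from inlining these calls, so the benchmark
    // measures an actual call rather than an empty body that the
    // optimizer has folded away.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public void DoSomething() {}

    [MethodImpl(MethodImplOptions.NoInlining)]
    public virtual void DoSomethingVirtual() {}

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void DoSomethingStatic() {}
}
```

Without this attribute, an empty non-virtual or static method can be inlined into the loop and effectively cost nothing, which is one reason empty-method benchmarks are so sensitive to methodology.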

So, that's weird. The point is that once you reach this level of optimization, unexpected results appear due to any number of optimizations at the compiler, JIT, or even hardware level. The differences we see may be the result of something outside our control, like the CPU's L2 caching strategy. Here be dragons.

+5

There are many reasons why counterintuitive results can arise. One is that virtual calls sometimes (perhaps most of the time) compile to the callvirt IL instruction, which performs a null check on the receiver (in addition to any vtable lookup). On the other hand, if the JIT can prove that only one specific implementation can be invoked at a given virtual call site (and that the reference is non-null), it will likely turn it into a direct call.
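As a rough illustration of that point (my own sketch, not code from the thread), a sealed type gives the JIT exactly the guarantee described above, so it can devirtualize the call:

```csharp
public class Base
{
    public virtual string Name() => "Base";
}

public sealed class Derived : Base
{
    public override string Name() => "Derived";
}

public static class Demo
{
    public static string CallThroughBase(Base b)
    {
        // Compiled to callvirt: a null check plus virtual dispatch,
        // unless the JIT can prove the concrete type of b.
        return b.Name();
    }

    public static string CallSealed(Derived d)
    {
        // Derived is sealed, so the JIT knows exactly which Name()
        // will run and is free to devirtualize (and even inline) it.
        return d.Name();
    }
}
```

This is why the original table shows aSealedClass.Interface() as "(inlined)" with a near-zero time: sealing the class let the runtime skip the dispatch entirely.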

I think this is one of the few things that really shouldn't matter in the design of your application. You should treat virtual / sealed as language constructs expressing design intent, not as runtime performance switches (let the runtime do what it does best). If a method should be virtual for your application's design, make it virtual. If it does not need to be virtual, don't make it virtual. And if you are not going to base the design of your application on this, then there is no need to benchmark it (except out of curiosity).
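In code, that design-first advice just means choosing the modifier that expresses intent (a hypothetical example; the Exporter and CsvExporter names are mine, not from the thread):

```csharp
public abstract class Exporter
{
    // Virtual because subclasses are *meant* to customize it,
    // not because of any call-overhead consideration.
    public virtual string FormatHeader() => "# export";

    public abstract void Write(string path);
}

public sealed class CsvExporter : Exporter
{
    // Sealed because further overriding is not part of the design;
    // any devirtualization the JIT does is a side benefit.
    public override void Write(string path)
    {
        System.IO.File.WriteAllText(path, FormatHeader());
    }
}
```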

0

Source: https://habr.com/ru/post/1479556/

