What is a good way to have a common interface for code without the cost of dynamic dispatch?

I am writing code to process data. There are several groups of processing functions that can be selected by the user and then applied to the data set. I would like to implement each group in a different place, but since they all take the same parameters and all do similar things, I would like to have a common interface for them.

Being a good C++ programmer, my first thought was to just use polymorphism: create an abstract class with the desired interface, and then derive each set of processing objects from it. My hopes were quickly shattered when I thought of another wrinkle. These data sets are huge, resulting in these functions being called literally billions of times. Although dynamic dispatch is pretty cheap, as I understand it, it is still slower than a plain function call.

My current idea to combat this is to use function pointers, along these lines:

    void dataProcessFunc1(mpz_class &input) { ... }
    void dataProcessFunc2(mpz_class &input) { ... }
    ...
    class DataProcessInterface
    {
        ...
        void (*func1)(mpz_class &);
        void (*func2)(mpz_class &);
        ...
    };

With some kind of constructor or something to set pointers to the right things.
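For example, a rough sketch of the wiring I have in mind might look like this (the constructor and the placeholder function bodies are illustrative only, not settled code):

    #include <gmpxx.h>  // mpz_class comes from GMP's C++ bindings

    void dataProcessFunc1(mpz_class &input) { input += 1; }  // placeholder body
    void dataProcessFunc2(mpz_class &input) { input *= 2; }  // placeholder body

    class DataProcessInterface
    {
    public:
        // Wire the pointers to one group's functions at construction time.
        DataProcessInterface(void (*f1)(mpz_class &), void (*f2)(mpz_class &))
            : func1(f1), func2(f2) {}

        void (*func1)(mpz_class &);
        void (*func2)(mpz_class &);
    };

    // Select the group once, then every call goes through the pointers:
    // DataProcessInterface group(dataProcessFunc1, dataProcessFunc2);
    // group.func1(value);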

So, I think my question is this: is this a good approach? Is there another way? Or should I just learn to stop worrying and love dynamic dispatch?

+4
4 answers

A virtual function call is a function call through a pointer. The overhead is usually about the same as that of an explicit call through a function pointer. In other words, your idea is likely to gain very little (perhaps nothing at all).

My immediate reaction would be to start with virtual functions and only worry about something else when/if a profiler shows that the overhead of the virtual calls is significant.
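For illustration, a minimal version of that starting point might look like this (the class and method names are mine, not from the question):

    #include <gmpxx.h>

    class Processor
    {
    public:
        virtual ~Processor() {}
        virtual void process(mpz_class &element) = 0;
    };

    class GroupA : public Processor
    {
    public:
        void process(mpz_class &element) override { element += 1; }  // placeholder
    };

    // The user's selection is resolved once; each call then costs one
    // indirect call, roughly the same as a call through a function pointer.
    // Processor *p = new GroupA;
    // p->process(value);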

When/if the profiler does show that, another possibility would be to define the interface in a class template, and then put the various implementations of that interface into template specializations. This normally eliminates all runtime overhead (although it is often a fair amount of extra work).
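A rough sketch of that idea, under my own naming (the tags and placeholder bodies are assumptions, purely to show the shape):

    #include <gmpxx.h>

    struct GroupATag {};
    struct GroupBTag {};

    // The primary template is left undefined; it only names the interface.
    template <typename Group>
    class ProcessGroup;

    // Each specialization supplies its own implementation of process().
    template <>
    class ProcessGroup<GroupATag>
    {
    public:
        static void process(mpz_class &element) { element += 1; }  // placeholder
    };

    template <>
    class ProcessGroup<GroupBTag>
    {
    public:
        static void process(mpz_class &element) { element *= 2; }  // placeholder
    };

    // The group is chosen at compile time, so this is a plain (and
    // inlinable) static call:
    // ProcessGroup<GroupATag>::process(value);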

+7

I disagree with the answer above that says a template-based solution costs extra work or development time. In fact, template-based solutions let you write faster code by eliminating the need for virtual functions or calls through pointers (although I agree that this mechanism still does not impose significant overhead).

Suppose you parameterize your processing interface with a series of "facets", that is, pieces of processing (or functions) that the client can configure to compose the processing interface. Imagine a class parameterized on three such processors (three just as an example):

    // note: do_nothing (shown below) must be declared before this point
    // for the default arguments to compile
    template <typename Proc1, typename Proc2 = do_nothing, typename Proc3 = do_nothing>
    struct ProcessingInterface
    {
        static void process(mpz_class &element)
        {
            Proc1::process(element);
            Proc2::process(element);
            Proc3::process(element);
        }
    };

If the client has different "processors", each with a static process function that knows how to process an element, you can write a class like this to combine the three of them. Note the do_nothing class used as the default, which has an empty method:

    class do_nothing
    {
    public:
        static void process(mpz_class &) {}
    };

These calls have no dispatch overhead: they are ordinary static calls, and the client configures the processing with ProcessingInterface<Facet1, Facet2>::process(data); .
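As an example, a client facet might look like this (increment_facet is a made-up name, purely to show the required shape: a static process() taking mpz_class&):

    class increment_facet
    {
    public:
        static void process(mpz_class &element) { element += 1; }
    };

    // Proc2 and Proc3 fall back to do_nothing:
    // ProcessingInterface<increment_facet>::process(data);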

This only applies if you know the different "facets" or "processors" at compile time, which is similar to your first example.

Note that you can write a more sophisticated version using metaprogramming tools such as the boost.mpl library, to accept more classes, iterate through them, and so on.
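As a simpler alternative to reaching for boost.mpl, here is a variadic-template sketch of the same idea (this is my own substitution, not what the answer describes); it accepts any number of facets and applies them in order:

    template <typename... Procs>
    struct ProcessingInterfaceN
    {
        static void process(mpz_class &element)
        {
            // Expand the pack; the braced-init-list sequences the calls
            // left to right.
            int dummy[] = { (Procs::process(element), 0)..., 0 };
            (void)dummy;
        }
    };

    // ProcessingInterfaceN<increment_facet, do_nothing>::process(data);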

+3

The abstract interface approach is by far the cleanest in terms of coding, and it is much preferable to cluttering your code with function pointers, which is really writing C in C++.

Have you actually determined that the interface is the source of a performance problem?

It's best to write readable and maintainable code first, and optimize only if you need to.

+2

These data sets are huge, resulting in these functions being called literally billions of times. Although dynamic dispatch is pretty cheap, as I understand it, it is much slower than a standard function call.

Billions of times over what period? If your application runs for an hour, billions of function calls are nothing: a billion calls spread over an hour works out to fewer than 300,000 calls per second, and the dispatch overhead will not show up in performance. But if the entire data set is processed in 100 ms, billions of function calls are a significant source of overhead. Simply stating how many times a function is called is meaningless; what matters is how often it is called, that is, the number of calls per unit of time.

If this is a performance issue at all, I would go with templates. The user is not going to choose before each call which actions should be applied; he is going to make the decision once, and then all the billions of calls proceed from that one decision.

Just define a class for each group of functions the user can choose from, make sure they all expose the same interface (perhaps using CRTP to simplify that a bit and to factor out common code), and then, depending on the user's choice, instantiate the templated driver that runs all the processing with the corresponding strategy class.
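A rough CRTP sketch of what I mean (the names and the placeholder body are mine): common driver code lives in the base, each group supplies processOne(), and the compiler resolves the call statically.

    #include <gmpxx.h>
    #include <vector>

    template <typename Derived>
    class ProcessorBase
    {
    public:
        // Shared driver code: iterate the data set, dispatching statically.
        void processAll(std::vector<mpz_class> &data)
        {
            for (mpz_class &element : data)
                static_cast<Derived *>(this)->processOne(element);
        }
    };

    class GroupA : public ProcessorBase<GroupA>
    {
    public:
        void processOne(mpz_class &element) { element += 1; }  // placeholder
    };

    // The user's choice happens once, outside the hot loop:
    // GroupA a;
    // a.processAll(dataset);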

But as other answers said, this may not be a performance bottleneck. Do not waste time optimizing code that does not matter.

+1

Source: https://habr.com/ru/post/1300569/

