Optimization Methods in C++

Is there a website that provides a good list of common optimization techniques in C++?

By optimization I mean that you have to change the source code to make the program run faster, without changing compiler settings.

+21
c++ optimization c gcc g++
Mar 17 '09 at 12:20
17 answers

Two ways to write better programs:

Use the language well

  • Code Complete by Steve McConnell
  • Effective C++
  • Exceptional C++

Profile your application

  • Determine which areas of the code take how long
  • See whether better data structures / algorithms could speed things up

There is not much language-level optimization to be done - it comes down to using language constructs well (learned from #1). The main gains come from #2 above.
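
To make point #2 concrete, here is a small hedged sketch (the function names are mine, not from any of the books above): the same membership test done as a linear scan over a vector versus a hash set, i.e. a data-structure change rather than a line-level tweak.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// O(n) per lookup: fine for a handful of items, painful in a hot loop.
bool contains_linear(const std::vector<std::string>& words, const std::string& w) {
    return std::find(words.begin(), words.end(), w) != words.end();
}

// O(1) average per lookup: same answer, better data structure.
bool contains_hashed(const std::unordered_set<std::string>& words, const std::string& w) {
    return words.count(w) != 0;
}
```

No amount of polishing the linear version's inner loop competes with switching the container.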

+17
Mar 17 '09 at 12:31

I will echo what others have said: a better algorithm is what wins in terms of performance gains.

That said, I work in image processing, which as a problem domain can be trickier. For example, many years ago I had a chunk of code that looked like this:

    void FlipBuffer(unsigned char *start, unsigned char *end)
    {
        unsigned char temp;

        while (start <= end) {
            temp = _bitRev[*start];
            *start++ = _bitRev[*end];
            *end-- = temp;
        }
    }

which rotates a 1-bit frame buffer by 180 degrees. _bitRev is a 256-byte table of bit-reversed values. This code is about as tight as you can make it. It ran on an 8 MHz 68000 laser printer controller and took roughly 2.5 seconds for a legal-size sheet of paper. To spare you the details: the customer could not live with 2.5 seconds. The solution was the same algorithm. The difference was that

  • I used a 128K table and worked in words instead of bytes (the 68000 is much happier with words)
  • I used Duff's device to unroll the loop within the span of a short branch
  • I added an optimization to skip blank words
  • I finally rewrote it in assembly to take advantage of the sobgtr instruction (subtract one and branch on greater) and to get "free" post-increment and pre-decrement in the right places

So: a 5x speedup, with no change to the algorithm.

The point is that you also need to understand your problem domain and where the bottlenecks are. In image processing the algorithm is still king, but if your loops are doing extra work, you multiply that work by several million and that is the price you pay.
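
The answer never shows how a table like _bitRev gets built; a minimal reconstruction (mine, not the original printer code) of a 256-entry bit-reversal table might look like:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Build a 256-entry table mapping each byte to its bit-reversed value,
// so reversing a byte at run time is a single table lookup.
std::array<std::uint8_t, 256> makeBitRevTable() {
    std::array<std::uint8_t, 256> table{};
    for (int i = 0; i < 256; ++i) {
        std::uint8_t rev = 0;
        for (int bit = 0; bit < 8; ++bit)
            if (i & (1 << bit))
                rev |= static_cast<std::uint8_t>(1 << (7 - bit)); // mirror bit positions 0..7
        table[i] = rev;
    }
    return table;
}
```

The 128K word-sized table from the answer is the same idea scaled up: trade memory for fewer, wider operations.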

+24
Mar 17 '09 at 12:58

Do not forget a few things:

  • "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." (Donald Knuth)
  • We gain more by optimizing the algorithms than by optimizing the code.
  • We optimize only the slow parts of existing code, as identified by a profiler or another special tool.

+10
Mar 17 '09 at 12:35

Agner Fog has done an excellent job analyzing the output of several compilers for various C++ constructs. You will find his work here: http://www.agner.org/optimize/ .

Intel offers an excellent document, the Intel® 64 and IA-32 Architectures Optimization Reference Manual, which can be found at http://www.intel.com/products/processor/manuals/index.htm . Although it primarily targets IA-32 architectures, it provides general guidelines that apply to most platforms. Obviously, it and Agner Fog's guides overlap a bit.

As mentioned in other answers, micro-optimization is of course the last step you take to make your program faster, after profiling and choosing the right algorithm.

+9
Mar 17 '09 at 12:42

You may be interested in the Optimizing C++ Wikibook.

+5
Oct 13 '10

I can't point to a site off the top of my head, but Sutter's Exceptional C++ book is excellent for C/C++ development. I highly recommend every C++ programmer read it, as it gives insight not only into optimization, but into clever use of the language, so that you can program really well.

+4
Mar 17 '09 at 12:24

In other engineering disciplines, it is customary to set budgets for system components. For example, the engines of a VTOL aircraft are designed to provide a certain amount of lift, so the weight must stay within a limit. At a high level, each part of the aircraft is allocated a portion of the weight budget that it must meet.

The reason this is done top-down, instead of waiting until the aircraft is too heavy to get off the deck and then weighing each part and shaving a bit off the heaviest ones, is partly the cost of changing components that have already been fabricated. But mostly it is that if you build a system where everything slightly exceeds its budget everywhere, you cannot fix it in any one place.

A classic software example is Irix 5.1 on the SGI Indy, which is part of the reason graphics-intensive users now have Mac and Windows machines rather than SGI boxes.

"The worst thing about 5.1 performance is that nobody knows exactly where it went. If you start asking around, you get a lot of finger-pointing and theories, but few facts. In a May report, I proposed the '5% theory', which says that each added feature (Motif, internationalization, drag and drop, DSOs, multiple fonts, etc.) costs about 5% of the machine. After 15 or 20 of these, most of the performance is gone."

Often when discussing performance, 5% is dismissed as insignificant, and the advice is to wait until a problem occurs and then find the one bottleneck. On a large system, by the time you have waited for a problem to show up, you may simply have lost your core business.

+3
Mar 17 '09 at 13:04

++p is usually faster than p++, and --p faster than p--, especially for objects of class types with overloaded prefix and postfix increment and decrement operators: the prefix form simply increments or decrements and returns the new value, whereas the postfix form increments or decrements but must save the previous value somewhere in order to return it. That is, instead of (replace int with your favorite class here)

    for (int i(0); i < x; i++)

always write

    for (int i(0); i < x; ++i)
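
To make the cost visible, here is a toy sketch (the Counter class is my invention, not from the answer) that counts how many copies the postfix form forces:

```cpp
#include <cassert>

// Toy class whose copy constructor counts invocations, to show the extra
// copy the postfix operator must make in order to return the old value.
struct Counter {
    int value = 0;
    static int copies;

    Counter() = default;
    Counter(const Counter& other) : value(other.value) { ++copies; }

    Counter& operator++() {      // prefix: increment and return *this, no copy
        ++value;
        return *this;
    }
    Counter operator++(int) {    // postfix: must copy the old value to return it
        Counter old(*this);
        ++value;
        return old;
    }
};
int Counter::copies = 0;
```

For a plain int the compiler usually optimizes the difference away when the result is unused, but for class types the prefix habit is free insurance.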
+2
Mar 17 '09 at 13:04

You asked for sites/sources on optimization.

Some good ones have been suggested.

I would add that almost everyone says profiling is the best, if not the only, way to find performance problems.

I'm not sure where that folk wisdom came from or how it was justified, but there is a better way.

ADDED:

It is true that a "bad algorithm" can kill performance, but it is certainly not the only thing that can.

I have done a lot of tuning. On big software, what usually kills performance is too much data structure and too many layers of abstraction.

Seemingly innocent one-line method calls on abstract objects make it easy to forget what that call can cost. Multiply that tendency across several layers of abstraction, and you find things like all the time being spent allocating and collecting things such as iterators and collection classes, when simple arrays with indexing would have sufficed (and been no less maintainable), just less "correct".

That is the problem with "common wisdom": it is often the opposite of wisdom.
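
As an illustrative sketch of that point (my own example, not the answerer's code): a histogram over byte values written with a generic associative container, next to the "less correct-looking" plain array that does the same job with no allocation at all:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Generic version: every increment pays for tree traversal and possibly
// a node allocation.
std::map<std::uint8_t, long> histogramMap(const std::vector<std::uint8_t>& data) {
    std::map<std::uint8_t, long> h;
    for (std::uint8_t b : data) ++h[b];
    return h;
}

// Plain-array version: the key space is small and dense, so simple indexing
// suffices -- and it is no less maintainable.
std::vector<long> histogramArray(const std::vector<std::uint8_t>& data) {
    std::vector<long> h(256, 0);
    for (std::uint8_t b : data) ++h[b];
    return h;
}
```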

+2
Mar 17 '09 at 13:54

Most methods depend on the compiler, as different compilers optimize differently.

If you want optimization tips that are compiler-agnostic, here are two for you:

  • Don't do it.
  • (for experts only!) Don't do it yet.

(apologies to Michael A. Jackson)

+2
Mar 17 '09 at 21:28

Most optimizations are language-agnostic. Understand your code, and understand the hardware you run on, and you can do most low-level optimizations.

Understand your problem domain and suitable algorithms, and you can perform any high-level optimizations.

The only C++-specific optimization tip I can think of is "understand what your code means". Understand when C++ copies temporaries around, understand which constructors and destructors get called and when.

And prefer functors to function pointers, since the former can be inlined by the compiler. In general, move as much as possible from run time to compile time. Use templates for the heavy lifting.

And, of course, do not try to optimize until you have profiled and established that 1) optimization is necessary and 2) what needs to be optimized.

Edit: A commenter asked why functors, as opposed to function pointers, can be inlined. Here is an explanation:

Functions are usually compiled separately. So what does the compiler know about a function F that takes a function pointer FP as an argument? Nothing; it has to look at the places where F is called, and maybe there it can find a definite clue as to what FP points to. If it can determine that, when called from here, FP will ALWAYS point to function G, then yes, it can make an inlined version of F, with G inlined inside it, for that particular call site. But it cannot simply inline G without inlining F, because F may be called from other places where a different function pointer is passed in. And even then, it takes costly whole-program optimization to determine that anything can be inlined.

Imagine you are passing a functor like this:

    struct Ftor {
        void operator()() { ... }
    };

so the function F looks like this:

    void F(const Ftor& ft) {
        ...
        ft();
        ...
    }

Now the compiler knows exactly which function is being called: the ft() line in the body calls Ftor::operator(). So the call can easily be inlined.

Of course, in practice you would usually template it, so that the function can be called with any type of functor:

    template <typename T>
    void F(const T& ft) {
        ...
        ft();
        ...
    }
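
Putting those fragments together, a compilable sketch (the names applyPtr, applyFtor, and Doubler are mine) of the two call styles side by side:

```cpp
#include <cassert>

// Through a function pointer: at this call site the compiler generally
// cannot tell which function fp refers to, so it cannot inline the call.
int applyPtr(int x, int (*fp)(int)) { return fp(x); }

// Through a functor type known at compile time: the target is fixed
// (F::operator()), so the compiler is free to inline it.
template <typename F>
int applyFtor(int x, const F& f) { return f(x); }

int doubleIt(int x) { return 2 * x; }

struct Doubler {
    int operator()(int x) const { return 2 * x; }
};
```

This is the same reason std::sort with a functor (or lambda) typically beats qsort with its comparison-function pointer.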
+1
Mar 17 '09 at 12:34
  • Profile your code to find out what actually takes the most time.
  • Make sure you are using the correct algorithm.
  • If you do a lot of number crunching, be friendly to your cache: try to do everything you can with a piece of data while it is loaded, so you do not have to pull it into the cache many times.
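
A minimal sketch of that last point (my own example): two passes over an array pull the data through the cache twice, while fusing them touches each element once:

```cpp
#include <cassert>
#include <vector>

// Two passes: for data larger than the cache, every element is loaded,
// stored, evicted, and then loaded again for the second pass.
void scaleThenOffsetTwoPass(std::vector<float>& v, float s, float o) {
    for (float& x : v) x *= s;
    for (float& x : v) x += o;
}

// One fused pass: each element is loaded once, fully processed, stored once.
void scaleThenOffsetFused(std::vector<float>& v, float s, float o) {
    for (float& x : v) x = x * s + o;
}
```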
+1
Mar 17 '09 at 12:35

Sorry, I have no links for you, but I have one more anecdote to add to the pile.

I had a rather large std::map that I keyed with a Microsoft CString object. Performance was unacceptable. Since all my strings were the same length, I created a class wrapper around a plain old fixed-size character array to emulate the CString interface. Unfortunately, I do not remember the exact speedup, but it was significant, and the resulting performance was more than sufficient.

Sometimes you need to learn a little about the library designs you rely on.
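
The original wrapper isn't shown; a minimal sketch of the idea, assuming 16-byte keys (the size and the FixedKey name are my invention, and the real wrapper would emulate more of the CString interface), might look like:

```cpp
#include <cassert>
#include <cstring>
#include <map>

// All keys have the same, known length, so a fixed-size char array avoids
// per-key heap allocation and bounds every comparison by a single memcmp.
struct FixedKey {
    char data[16];

    explicit FixedKey(const char* s) {
        std::memset(data, 0, sizeof data);
        std::strncpy(data, s, sizeof data - 1);  // zero-padded, always terminated
    }
    bool operator<(const FixedKey& other) const {
        return std::memcmp(data, other.data, sizeof data) < 0;
    }
};
```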

+1
Mar 17 '09 at 21:40

Template metaprogramming can be used to move from dynamic polymorphism to compile-time polymorphism, producing insanely optimal code in the process. Alexandrescu's Modern C++ Design is a good book that covers TMP in depth. Not every page is about optimization, but it is a recurring consideration in the designs.
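
One common technique in that spirit is CRTP (this sketch is mine, not an example from the book): the "polymorphic" call below is resolved at compile time, so the compiler can inline it instead of dispatching through a vtable:

```cpp
#include <cassert>

// CRTP base: the derived type is a template parameter, so area() can call
// the derived implementation without any virtual dispatch.
template <typename Derived>
struct Shape {
    double area() const {
        return static_cast<const Derived&>(*this).areaImpl();
    }
};

struct Square : Shape<Square> {
    double side;
    explicit Square(double s) : side(s) {}
    double areaImpl() const { return side * side; }
};

struct Rect : Shape<Rect> {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double areaImpl() const { return w * h; }
};

// Generic code over any Shape<D>, still fully resolved at compile time.
template <typename D>
double twiceArea(const Shape<D>& s) { return 2.0 * s.area(); }
```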

+1
Feb 13 '12 at 7:52

The biggest optimizations come from revisiting the design, after profiling the relevant application components/algorithms. This is usually language-independent.

I mean (as an illustration): if you get a 30% performance improvement by choosing a slightly better algorithm (or collection/container class), the improvement you can expect from C++-specific refactoring is rarely more than 2%. Design improvements can give you anything above 30%.

If you have a specific application, the best strategy is to measure and profile it. Profiling usually gives the most immediate idea of which parts matter for performance.

0
Mar 17 '09 at 12:32

Here are a couple of catch-all approaches to optimization.

There is no one-size-fits-all solution for optimization tasks... they always have to be hand-tuned to the hardware/software/system considerations at play.




Assuming you already have the best algorithm:

  • compile with "show assembly output" and "highest optimization"
  • look at the assembly
  • identify inefficiencies that defeat compiler optimization, and poor cache behavior
  • rewrite the code snippet
  • if it is still bad, go back to step 2
  • done

An example here: What is the fastest way to swap values in C?




General tips:

http://www.ceet.niu.edu/faculty/kuo/exp/exp6/figuree6-1.jpg :

  • Try it first in floating point
  • Try it second in fixed point
  • If you are really desperate, and have lots of time and money, try it in assembly

http://www.asc.edu/seminars/optimization/fig1.gif :

  • Determine whether it is memory-bound, I/O-bound, or CPU-bound
  • Attack whichever is the limiting factor



0
Mar 17 '09 at 15:14

Contrary to what many people have said, there are plenty of optimizations you can do. There is a great resource at Wikibooks. Keep these in mind while developing code, then profile, profile, profile.

0
Sep 29 '16 at 18:06


