How much should I worry about the Intel C++ compiler emitting suboptimal code for AMD?

We have always been an Intel shop. All the developers use Intel machines, the recommended platform for end users is Intel, and if end users want to run on AMD, that is their lookout. Maybe the test department has an AMD machine somewhere to check that we are not shipping anything completely broken, but that is about it.

Until a few years ago we just used the MSVC compiler, and since it does not really offer many processor tuning options beyond the SSE level, nobody worried much about whether the code might favour one x86 vendor over another. More recently, however, we have been using the Intel compiler a lot. Our code definitely gets some significant performance benefits from it (on our Intel hardware), and its vectorization capabilities mean less need to drop down to asm/intrinsics. However, people are starting to get a little nervous about whether the Intel compiler might not do such a good job on AMD hardware. Certainly, if you step into the Intel CRT or IPP libraries you see a lot of cpuid queries, apparently to set up jump tables to optimized functions. It seems unlikely that Intel goes to much trouble to do anything good for AMD chips, though.

Can anyone with experience in this area comment on whether this is a big deal in practice? (We have yet to do any performance testing on AMD ourselves.)

Update 2010-01-04. Well, the need for AMD support never became concrete enough for me to do the testing myself. There are some interesting comments on the issue here, here and here.

Update 2010-08-09. It seems the Intel-FTC settlement has something to say about this issue: see the section “Compilers and dirty tricks” in this article.

+28
c++ optimization compiler-construction intel amd-processor
May 8 '09 at 12:55
7 answers

Buy an AMD box and run it on that. That seems like the only responsible thing to do, rather than trusting strangers on the Internet ;)

In addition, I believe that part of AMD’s lawsuit against Intel is based on the claim that the Intel compiler specifically creates code that is inefficient on AMD processors. I don’t know if this is true or not, but AMD seems to think so.

But even if they are not intentionally doing this, there is no doubt that the Intel compiler is optimized specifically for Intel processors and nothing more.

That said, I doubt it will make a huge difference. The AMD CPU will still benefit from all the auto-vectorization and other clever compiler features anyway.

+16
May 08 '09 at 1:10 pm

What we have seen is that wherever the Intel compiler has to make a run-time choice about the available instruction set, if it does not recognize an Intel CPU it falls back to its "standard" code (which, as you might expect, may not be optimal).

Please note that even though I used the word “compiler” above, this happens mainly in their supplied (pre-compiled) libraries and intrinsics, which check the instruction set and call the best code.
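To make the difference concrete, here is a minimal sketch of the two dispatch policies being discussed. It is not Intel's actual dispatcher: `CpuInfo`, `CodePath`, `select_by_features` and `select_by_vendor` are all hypothetical names, and the CPU description is passed in as plain data rather than read from the cpuid instruction, so the logic can be tried on any machine. The vendor strings "GenuineIntel" and "AuthenticAMD" are the ones the two vendors report through CPUID leaf 0.

```cpp
#include <string>

// Hypothetical names throughout; this only illustrates the policy,
// it is not code taken from any Intel library.
enum class CodePath { Baseline, Sse2, Avx };

struct CpuInfo {
    std::string vendor;  // e.g. "GenuineIntel" or "AuthenticAMD"
    bool has_sse2;       // feature flags as the CPU itself reports them
    bool has_avx;
};

// Feature-based dispatch: pick the best path the CPU says it supports,
// regardless of who made it.
CodePath select_by_features(const CpuInfo& cpu) {
    if (cpu.has_avx)  return CodePath::Avx;
    if (cpu.has_sse2) return CodePath::Sse2;
    return CodePath::Baseline;
}

// Vendor-gated dispatch: an unrecognized vendor gets the "standard" path
// even if its feature flags advertise something better.
CodePath select_by_vendor(const CpuInfo& cpu) {
    if (cpu.vendor != "GenuineIntel") return CodePath::Baseline;
    return select_by_features(cpu);
}
```

With a `CpuInfo` describing an AMD chip that supports AVX, `select_by_features` picks the AVX path while `select_by_vendor` falls back to the baseline, which is the behaviour described above.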

+5
Jan 05 '10 at 17:02

At the risk of stating the obvious: if performance is critical to your application, then you had better do some testing, on all hardware/compiler combinations. There are no guarantees. As outsiders, we can only give you our guesses and prejudices. Your software may have unique characteristics that are unlike anything we have seen.

My experience:

I worked at Intel and developed my own (C++) application where performance was critical. We tried the Intel C++ compiler and it always underperformed gcc, even after doing profiling runs, recompiling with the profile information (which icc supposedly uses to optimize) and re-running on the same data set. (This was in 2005-2007; things may be different now.) So, based on my experience, you could try gcc (in addition to icc and MSVC); you may get better performance that way and sidestep this issue altogether. It should not be too hard to switch compilers (if your build process is reasonable).

Now I work for another company, and IT experts conduct extensive hardware testing, and for a while the Intel and AMD hardware were relatively comparable, but the latest generation of Intel hardware was significantly superior to AMD. As a result, I believe that they have acquired a significant number of Intel processors and recommend them to our customers who run our software.

But back to the question of whether the Intel compiler specifically targets AMD hardware to make it run slowly. I doubt Intel bothers with that. Some optimizations that rely on knowledge of the internals of Intel architectures or chipsets may well run slower on AMD hardware, but I doubt they are specifically aimed at AMD hardware.

+4
May 08 '09 at 17:08

Sorry if you pressed my general button.

This is about low-level optimization, so it only matters for code that 1) the program counter spends a lot of time in, and 2) the compiler actually sees. For example, if the PC spends most of its time in library routines that you do not compile, it should not matter much.

Whether or not conditions 1 and 2 are met, here is my experience of how optimization goes:

Several iterations of sampling and fixing are done. In each one a problem is identified, and more often than not it is not about where the program counter is. Rather, there are function calls at the middle levels of the call stack that, since performance is paramount, could be replaced. To find them quickly, I do this.

Keep in mind that if a function call instruction is on the stack for a significant fraction of the execution time, whether in a few long invocations or in a great many short ones, that call is responsible for that fraction of the time, so removing it or executing it less often can save a lot of time. And those savings far exceed any low-level optimization.
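The "fraction of time on the stack" idea can be sketched in a few lines. This is only an illustration under assumptions: the samples are taken to be call stacks captured at random moments, each frame is a hypothetical string naming a call site, and `stack_residency` is a name made up for this example rather than part of any profiler.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// One sample = the call stack captured at a random moment,
// outermost frame first. Frame names are hypothetical.
using StackSample = std::vector<std::string>;

// For every call site, return the fraction of samples in which it appears
// anywhere on the stack. A site seen in 75% of the samples is responsible
// for roughly 75% of the run time, however short each invocation is.
std::map<std::string, double> stack_residency(
        const std::vector<StackSample>& samples) {
    std::map<std::string, double> fraction;
    if (samples.empty()) return fraction;
    for (const StackSample& stack : samples) {
        // Count each call site once per sample, even if it recurses.
        std::set<std::string> seen(stack.begin(), stack.end());
        for (const std::string& site : seen) fraction[site] += 1.0;
    }
    for (auto& entry : fraction) {
        entry.second /= static_cast<double>(samples.size());
    }
    return fraction;
}
```

With only a handful of samples the fractions are rough, but a call site that shows up in most of them is a strong candidate for removal or for being called less often.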

At this point the program can be many times faster than it was to begin with. I have never seen a good-sized program, no matter how carefully written, that could not benefit from this process. If the process has not been done, you should not assume that low-level optimization is the only way to speed up the program.

Once this process has been carried to the point where it simply cannot be taken any further, and if the samples show that the PC is in code that the compiler sees, then low-level optimization can make a difference.

+2
May 11 '09 at 11:51

At the time this thread was started, Microsoft C++ defaulted to generating code which in some cases was good for AMD and bad for Intel. Their later compilers default to the blend option, which is good for both, especially now that both CPU brands have dealt with some of their own specific performance glitches. When I first worked at Intel, their compilers reserved some optimizations for the Intel-specific architecture settings. I suppose that may have been the subject of some of the FTC declarations, although it did not come up in my 10 hours of testimony, and the practice was already on its way out because of the convergence of optimization requirements among up-to-date CPU models and the need to use compiler development time more productively. If you were to use one of those obsolete compilers on an up-to-date Intel CPU, you might see some of the same performance deficiencies.

+2
Feb 10 '16 at 0:45

There is no point in worrying unless you can act on it. The possible actions are: do not buy AMD, or use a different compiler. So the obvious steps are:

(1) Buy one AMD box and measure the speed of the code compiled with the Intel compiler. Is it fast enough? If yes, you are done: you can buy AMD, no worries.

(2) If not: compile the code with another compiler and run it on the AMD box. Is it fast enough? If not, you are done: you cannot buy AMD, no worries.

(3) If yes: run that same code on an Intel box. Is it fast enough? If yes, you are done: you can buy AMD, but must switch compilers, no worries.

(4) If not: your options are to not buy AMD, to get rid of all the Intel computers, or to compile with two different compilers. Pick one.
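Steps (1) to (4) form a small decision tree, which can be written down as a sketch. The three booleans stand for the outcomes of the measurements you would actually have to run, and `Decision` and `decide` are hypothetical names used only for this illustration.

```cpp
// Hypothetical names; the inputs are the results of your own benchmarks.
enum class Decision {
    BuyAmdKeepIntelCompiler,  // step (1) succeeded
    DoNotBuyAmd,              // step (2) failed
    BuyAmdSwitchCompiler,     // step (3) succeeded
    PickATradeoff             // step (4): no single build is fast everywhere
};

Decision decide(bool intel_build_fast_on_amd,
                bool other_build_fast_on_amd,
                bool other_build_fast_on_intel) {
    // (1) The Intel-compiled code is already fast enough on AMD.
    if (intel_build_fast_on_amd) return Decision::BuyAmdKeepIntelCompiler;
    // (2) Not even another compiler makes it fast enough on AMD.
    if (!other_build_fast_on_amd) return Decision::DoNotBuyAmd;
    // (3) The other compiler is fast enough on both vendors.
    if (other_build_fast_on_intel) return Decision::BuyAmdSwitchCompiler;
    // (4) Each compiler only suits one vendor; a human has to choose.
    return Decision::PickATradeoff;
}
```

The last case deliberately returns no concrete action, since step (4) leaves the choice between its options to whoever owns the hardware budget and the build.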

0
Jul 14 '14 at 15:02

I have experienced deliberate crippling of technology first-hand, when a vendor tried to keep a Lotus product from reaching the market before their own offering. Working technology was available, but Lotus was forbidden to use it. Oh well...

A few years ago there were blog posts showing users that patching one byte in the Intel compiler made it emit “optimal” code that was not crippled when run on AMD. I have not looked for those blog posts in years.

I am inclined to believe that such competitive behavior continues. I have no other evidence.

0
Sep 29 '15 at