Code runs slower on Linux than on Windows

After porting my C code (originally written for Windows and compiled with VS 2008), I ran it on Linux. To my surprise, it is now at least 10 times slower than the Windows version.

Using profiling tools, I found that the following function accounts for most of the time spent in the application:

    /* advance by n bits */
    void Flush_Buffer(N)
    int N;
    {
      int Incnt;

      ld->Bfr <<= N;
      Incnt = ld->Incnt -= N;

      if (Incnt <= 24)
      {
        if (System_Stream_Flag && (ld->Rdptr >= ld->Rdmax-4))
        {
          do
          {
            if (ld->Rdptr >= ld->Rdmax)
              Next_Packet();
            ld->Bfr |= Get_Byte() << (24 - Incnt);
            Incnt += 8;
          }
          while (Incnt <= 24);
        }
        else if (ld->Rdptr < ld->Rdbfr+2044)
        {
          do
          {
            ld->Bfr |= *ld->Rdptr++ << (24 - Incnt);
            Incnt += 8;
          }
          while (Incnt <= 24);
        }
        else
        {
          do
          {
            if (ld->Rdptr >= ld->Rdbfr+2048)
              Fill_Buffer();
            ld->Bfr |= *ld->Rdptr++ << (24 - Incnt);
            Incnt += 8;
          }
          while (Incnt <= 24);
        }
        ld->Incnt = Incnt;
      }
    }

This function takes negligible time on Windows. On Linux, it takes about 14 seconds. What have I done wrong here?

There are no system calls here, so this section of the code should be independent of the OS and should therefore take about the same time on both platforms.

(My guess: this function is called many times, so maybe the profiler is reporting the accumulated time of all calls. In that case, I think one of the problems may be that the function does not receive its input parameter as quickly as it does on Windows.)

What have I done wrong here? Any guesses?

Rgrds,

H

2 answers

You can try to annotate all code paths in your code with counters. At the end of the program, each counter will tell you how many times its code path was executed. Comparing these numbers one by one between the Windows version and the Linux version may reveal that the program is executing different code paths. Depending on the nature of those code paths, the differences may explain why the Linux version is slower than the Windows version.

    int count[100];

    // Call this function at the end of program
    void PrintCounts()
    {
      int i;
      for(i=0; i<100; i++)
        printf("%d\n", count[i]);
    }

    void Flush_Buffer(int N)
    {
      int Incnt;

      ld->Bfr <<= N;
      Incnt = ld->Incnt -= N;

      if (Incnt <= 24)
      {
        count[0]++;
        if (System_Stream_Flag && (ld->Rdptr >= ld->Rdmax-4))
        {
          count[1]++;
          do
          {
            count[2]++;
            ...
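To make the idea concrete, here is a minimal sketch of the same counter technique applied to a small stand-in function. The names `refill`, `path_count`, `NUM_COUNTERS` and `print_counts` are hypothetical and are not part of the original program; the loop only mimics the shape of the refill loop in `Flush_Buffer`.

```c
#include <stdio.h>

#define NUM_COUNTERS 4

/* One slot per code path we want to observe (hypothetical layout). */
unsigned long path_count[NUM_COUNTERS];

/* Print every counter; intended to be called once at program exit. */
void print_counts(void)
{
    int i;
    for (i = 0; i < NUM_COUNTERS; i++)
        fprintf(stderr, "path %d: %lu\n", i, path_count[i]);
}

/* Hypothetical stand-in for the function under investigation:
   tops up a bit counter in steps of 8 until it exceeds 24. */
int refill(int bits_left)
{
    path_count[0]++;                 /* function entered */
    if (bits_left <= 24) {
        path_count[1]++;             /* refill branch taken */
        do {
            path_count[2]++;         /* one loop iteration */
            bits_left += 8;
        } while (bits_left <= 24);
    } else {
        path_count[3]++;             /* nothing to refill */
    }
    return bits_left;
}
```

One way to use it is to register the printer with `atexit(print_counts)` (from `<stdlib.h>`) at the start of `main`, run both builds on the same input, and compare the two outputs line by line. Writing to `stderr` keeps the counts separate from the program's normal output.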

This is more of a note than an answer, but it doesn’t quite fit in a comment, so I hope you will not hold it against me.

The term “profiling” has several related but different meanings. In the abstract, it means “measuring” your program, usually with respect to certain run-time data. However, this is not the same as simply “timing” your program. Timing is one form of profiling, but there are many others.

For example, suppose you are not sure whether some data structure should be a std::set (tree) or a std::unordered_set (hash table). There is no universal answer, because it depends on what you use it for and what data you process. You may not be able to know the right answer until you supply the actual real-world data you are going to use. In this case, “profile and decide” means that you make two versions of your program, run them against real data, and measure the execution time. Most likely, this is the kind of profiling you need.

GCC, on the other hand, has a tool called a profiler that serves a completely different purpose. It is an execution path profiler, if you like, which tells you where (i.e. in which function) your program spends most of its time. If you have a complex algorithm with many subroutines, you may not know which ones matter, and again this may depend on your actual input. In this case, the profiler can help you determine which functions are called the most for your input data, so you can focus on optimizing those functions. Here “profile before optimizing” means that you need to set priorities before getting started.

However, for the comparison you have in mind, you should not use the GCC profiler. Instead, compile on both platforms with optimizations enabled, and then measure the execution time on the same set of input data.
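A minimal sketch of such a measurement, using only the standard `clock()` function so that the same source compiles on both platforms. The workload `checksum` and the helper `time_checksum` are hypothetical placeholders for your real code, not anything from your program.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical workload standing in for the real program: sums a buffer. */
unsigned long checksum(const unsigned char *data, size_t len)
{
    unsigned long sum = 0;
    size_t i;
    for (i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

/* Run the workload `repeats` times and return the elapsed time
   reported by clock(), converted to seconds. */
double time_checksum(const unsigned char *data, size_t len, int repeats,
                     unsigned long *result)
{
    clock_t start, end;
    unsigned long sum = 0;
    int r;

    start = clock();
    for (r = 0; r < repeats; r++)
        sum += checksum(data, len);
    end = clock();

    /* Handing the result back to the caller makes it harder for an
       optimizing compiler to discard the work being timed. */
    *result = sum;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Call `time_checksum` with the same buffer and repeat count in both builds and print the returned value. Keep in mind that `clock()` reports processor time on Linux, while Microsoft's C runtime reports elapsed wall-clock time, so for a program that spends time waiting on I/O the two numbers are not strictly comparable.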


Source: https://habr.com/ru/post/1399544/

