Problem with a corrupt stack in a C/C++ program

I run a C/C++ program on Linux servers to serve video. The main job of the program (call it Plugin) is to convert video, and we fork a separate Plugin process for each video request. But I have a strange problem: sometimes the server load average becomes unexpectedly high. What I see in the top command at that point is that some processes have been running for a long time and are consuming a lot of CPU.

When I attach gdb to such a running process and take a backtrace, I find a corrupt stack: "Previous frame inner to this frame (corrupt stack?)". I searched the web and found that this happens when a program gets a segmentation fault.

But as far as I know, if a program gets a segmentation fault, it should crash and exit at that moment. Surprisingly, the program keeps running after the segmentation fault.

What could be the reasons for this? I know the program must have some serious bugs, but I just can't figure out where to start fixing the problem... It would be great if any of you could shed some light on this...

Thank you in advance

+4
5 answers

Attaching a debugger changes the behavior of the process, so you most likely will not get reliable results from the investigation. A corrupt stack message from the debugger may simply mean that this particular debugger cannot interpret the text information in the binary.

I would recommend running pstack several times in succession on the problematic process (this is called "Monte Carlo performance profiling"), and also attaching strace or truss to it to check which system calls the process is executing while it consumes CPU.

+2

Run your program under Valgrind and fix all the invalid memory writes it reports.

+1

Some optimizations, such as omitting the frame pointer, can make it difficult for the debugger to understand the stack.

0

If you have the source code, compile the program in debug mode and run it under Valgrind.

If you do not have the code, contact the author or vendor of the program.

A corrupt stack message means that the code is doing something strange with memory. It does not mean that the program has had a segmentation fault. Also, the program can keep running if it chooses to handle the SIGSEGV signal.

If by forking you mean that you have some parent process that spawns and starts other, smaller processes, just watch for such spikes and restart the offending process. This assumes that you do not have access to the code to fix the program.

0

There may be some unusual stack manipulation going on at the assembly level, for example optimization of true tail recursion, self-modifying code, functions that never return, and so on. Any of these can prevent the debugger from back-tracing the stack correctly and prompt it to report a corrupt stack. That does not necessarily mean that memory is corrupted, but something unconventional is definitely happening under the hood.

0

Source: https://habr.com/ru/post/1348153/

