Possible stack corruption

Following up on my previous question about GDB not pinpointing the location of a SIGSEGV:

My thread code is as follows:

    void *runner(void *unused)
    {
        do {
            sem_wait(&x);
            ...
            if (/* condition 1 check */) {
                sem_post(&x);
                sleep(5);
                sem_wait(&x);
                if (/* repeat condition 1 check; after at least 5 seconds */) {
                    printf("LEAVING...\n");
                    sem_post(&x);
                    // putting exit(0); here resolves the dilemma
                    return(NULL);
                }
            }
            sem_post(&x);
        } while (1);
    }

Main code:

    sem_t x;

    int main(void)
    {
        sem_init(&x, 0, 1);
        ...
        pthread_t thrId;
        pthread_create(&thrId, NULL, runner, NULL);
        ...
        pthread_join(thrId, NULL);
        return(0);
    }

Edit: Calling exit(0) in the runner thread's code makes the error go away.


What could be causing the stack corruption?

GDB output (0xb7fe2b70 is the runner thread's identifier):

    LEAVING...

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0xb7fe2b70 (LWP 2604)]
    0x00000011 in ?? ()

Valgrind Output:

    ==3076== Thread 2:
    ==3076== Jump to the invalid address stated on the next line
    ==3076==    at 0x11: ???
    ==3076==    by 0xA26CCD: clone (clone.S:133)
    ==3076==  Address 0x11 is not stack'd, malloc'd or (recently) free'd
    ==3076==
    ==3076==
    ==3076== Process terminating with default action of signal 11 (SIGSEGV)
    ==3076==  Bad permissions for mapped region at address 0x11
    ==3076==    at 0x11: ???
    ==3076==    by 0xA26CCD: clone (clone.S:133)
    ==3076==  Address 0x11 is not stack'd, malloc'd or (recently) free'd
+2
3 answers

Write a new source file with a main function that does the same things as the main you posted here, except that instead of using pthread_create it simply calls the function directly. See whether you can reproduce the problem independently of the use of threads. From the look of things, your semaphores should still work just fine in a single-threaded environment.

If it still fails, it will be much easier to debug.
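As a rough sketch of that suggestion, the version below calls runner directly on the calling thread. The question elides the real condition check, so condition_holds() is a hypothetical stand-in, and recheck_delay_seconds and run_without_threads() are likewise names invented here for illustration.

```c
#include <semaphore.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

static sem_t x;

/* Assumption: the 5-second delay from the question, made adjustable. */
static unsigned recheck_delay_seconds = 5;

/* Hypothetical stand-in for the elided "condition 1" check. */
static int condition_holds(void)
{
    return 1;
}

/* Same control flow as the posted runner. */
static void *runner(void *unused)
{
    (void)unused;
    do {
        sem_wait(&x);
        if (condition_holds()) {
            sem_post(&x);
            sleep(recheck_delay_seconds);
            sem_wait(&x);
            if (condition_holds()) {
                printf("LEAVING...\n");
                sem_post(&x);
                return NULL;
            }
        }
        sem_post(&x);
    } while (1);
}

/* Calls runner on the current thread instead of via pthread_create.
 * Returns 0 if runner came back normally, -1 otherwise. */
static int run_without_threads(void)
{
    void *result;

    if (sem_init(&x, 0, 1) != 0) {
        perror("sem_init");
        return -1;
    }
    result = runner(NULL);
    sem_destroy(&x);
    return result == NULL ? 0 : -1;
}
```

If calling run_without_threads() from main crashes in the same way, the backtrace in GDB should show ordinary frames rather than clone, which narrows down where the damage happens.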

Since you said that calling exit instead of returning did not produce the error, that suggests you have corrupted the return address on the stack, the one that was put in place when runner was started. When you call exit, you do not rely on this region of memory to get to the exit function (if you had returned, pthread_exit would have been called by the pthread library code that called runner). I think the valgrind output is not 100% accurate. That is not because of any fault in valgrind, but because the place where you trigger the error, combined with the type of error you trigger, makes it very difficult to be sure who called what from where.

Some gcc flags that may interest you:

 -fstack-protector-all -Wstack-protector 

The -Wstack-protector warning option has no effect unless one of the -fstack-protector options is also given.

You can also try:

 -fno-omit-frame-pointer 
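
To illustrate what these flags are for, here is a small, deliberately unsafe sketch. The function copy_unchecked and the file name demo.c are made up for this example and are not taken from the question. With -fstack-protector-all, GCC places a guard value between the local variables and the return address and checks it before the function returns, so an overflow like the one described in the comments typically ends in an abort with a "stack smashing detected" message rather than a jump to a garbage address.

```c
/* Possible build line (demo.c is a hypothetical file name):
 *   gcc -g -O0 -fstack-protector-all -Wstack-protector \
 *       -fno-omit-frame-pointer demo.c -o demo
 */
#include <stddef.h>
#include <string.h>

/* Deliberately unsafe, for demonstration only: strcpy does no bounds
 * checking, so any src of 20 or more characters writes past the end of
 * buffer and can overwrite the saved return address. */
static size_t copy_unchecked(const char *src)
{
    char buffer[20];

    strcpy(buffer, src);
    return strlen(buffer);
}
```

Calling copy_unchecked with a short string is harmless. Calling it with a long one is undefined behavior, and that is the case the stack protector is meant to catch at the point of return.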
+6

Your code is missing some of the important parts, but these are the most common causes of stack corruption:

  • Saving a pointer to an object on the stack and using it after the object has gone out of scope.
  • Buffer overflows, such as having char buffer[20] on the stack and writing out of bounds ( sprintf is a fantastic way to do this).
  • Bad casting, i.e. having a base class A on the stack, casting it to a derived class, and using it.
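
The first two causes have straightforward safer counterparts in C. The sketch below is only illustrative: format_id and make_counter are hypothetical helpers, not anything from the question. One uses snprintf with an explicit size instead of sprintf, and the other hands out heap storage instead of the address of a local variable.

```c
#include <stdio.h>
#include <stdlib.h>

/* Bounded formatting: snprintf never writes more than out_size bytes.
 * Returns 0 on success, -1 if the text did not fit or formatting failed. */
static int format_id(char *out, size_t out_size, int id)
{
    int n = snprintf(out, out_size, "thread-%d", id);

    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Returns heap storage that stays valid after this function returns,
 * unlike the address of a local variable. The caller must free() it. */
static int *make_counter(int start)
{
    int *p = malloc(sizeof *p);

    if (p != NULL) {
        *p = start;
    }
    return p;
}
```

The same idea applies to thread arguments: a pointer passed to pthread_create should refer to storage that outlives the function that created the thread.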
+2

Use valgrind or an equivalent memory-checking tool to figure this out. Stop guessing. Also stop posting incomplete code, especially if you don't know whether it contains the problem or not. The error may be outside this function. For example, the semaphore may not be initialized.

From the valgrind output, I would guess that your pthread_create() call is being passed an invalid function pointer. So pthread jumps to this bogus address and crashes. Obviously there is no stack ...
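
One way to rule out the setup problems mentioned above is to check every return value. The sketch below makes some assumptions: start_and_join is a name invented here, and the runner body is reduced to a single wait/post pair. Note that pthread_create does not validate the start routine itself, so passing the function by name and letting the compiler check its type is the main protection against a bad pointer.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <string.h>

static sem_t x;

/* Simplified runner: the real body from the question is elided there. */
static void *runner(void *unused)
{
    (void)unused;
    sem_wait(&x);
    sem_post(&x);
    return NULL;
}

/* Initializes the semaphore, starts runner in a thread and joins it,
 * reporting any failure. Returns 0 on success, -1 on error. */
static int start_and_join(void)
{
    pthread_t thr_id;
    int err;

    if (sem_init(&x, 0, 1) != 0) {      /* sets errno on failure */
        perror("sem_init");
        return -1;
    }
    err = pthread_create(&thr_id, NULL, runner, NULL);
    if (err != 0) {                     /* returns an error number */
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        sem_destroy(&x);
        return -1;
    }
    err = pthread_join(thr_id, NULL);
    sem_destroy(&x);
    if (err != 0) {
        fprintf(stderr, "pthread_join: %s\n", strerror(err));
        return -1;
    }
    return 0;
}
```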

+1

Source: https://habr.com/ru/post/1237218/
