I wrote a C application that works fine, except when it is given very large datasets as input.
With a large input, I get a segmentation fault near the end of the binary's run.
I ran the binary (with test input) under valgrind:
valgrind --tool=memcheck --leak-check=yes /foo/bar/baz inputDataset > outputAnalysis
This job usually takes several hours, but it took seven days under valgrind.
Unfortunately, I do not yet know how to read the results from this run.
I get a lot of these warnings:
...
==4074== Conditional jump or move depends on uninitialised value(s)
==4074==    at 0x435900: ??? (in /foo/bar/baz)
==4074==    by 0x439CC5: ??? (in /foo/bar/baz)
==4074==    by 0x400BF2: ??? (in /foo/bar/baz)
==4074==    by 0x402086: ??? (in /foo/bar/baz)
==4074==    by 0x402A0F: ??? (in /foo/bar/baz)
==4074==    by 0x41684F: ??? (in /foo/bar/baz)
==4074==    by 0x4001B8: ??? (in /foo/bar/baz)
==4074==    by 0x7FEFFFF57: ???
==4074==  Uninitialised value was created
==4074==    at 0x461D3A: ??? (in /foo/bar/baz)
==4074==    by 0x43F926: ??? (in /foo/bar/baz)
==4074==    by 0x416B9B: ??? (in /foo/bar/baz)
==4074==    by 0x416725: ??? (in /foo/bar/baz)
==4074==    by 0x4001B8: ??? (in /foo/bar/baz)
==4074==    by 0x7FEFFFF57: ???
...
The output does not point to any piece of code, variable name, etc. What can I do with this information?
At the end, I finally get the following error, but, as with smaller datasets that don't crash, valgrind does not detect any leaks:
...
==4074== Process terminating with default action of signal 11 (SIGSEGV)
==4074==  Access not within mapped region at address 0x7158E7F7
==4074==    at 0x7158E7F7: ???
==4074==    by 0x4020B8: ??? (in /foo/bar/baz)
==4074==    by 0x6322203A22656D6E: ???
==4074==    by 0x306C675F6E557267: ???
==4074==    by 0x202C22373232302F: ???
==4074==    by 0x6D616E656C696621: ???
==4074==    by 0x72686322203A2264: ???
==4074==    by 0x3030306C675F6E54: ???
==4074==    by 0x346469702E373231: ???
==4074==    by 0x646469662E34372F: ???
==4074==    by 0x722E64616568656B: ???
==4074==    by 0x63656D6F6C756764: ???
==4074==  If you believe this happened as a result of a stack
==4074==  overflow in your program main thread (unlikely but
==4074==  possible), you can try to increase the size of the
==4074==  main thread stack using the
Everything I allocate gets a matching call to free, after which I set the pointer to NULL.
At this point, what is the best way to debug this application to find out what else is causing the segmentation fault?
December 22, 2011 - Edit
I compiled a debug version of my binary called debug-binary using the following compilation flags:
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE=1 -DUSE_ZLIB -g -O0 -Wformat -Wall -pedantic -std=gnu99
When I run it under valgrind, I don't get much more information:
valgrind -v --tool=memcheck --leak-check=yes --error-limit=no --track-origins=yes debug-binary input > output
Here is a snippet of output:
==25116== 2 errors in context 14 of 14:
==25116== Invalid read of size 4
==25116==    at 0x4045E8: ??? (in /foo/bar/debug-binary)
==25116==    by 0x40682F: ??? (in /foo/bar/debug-binary)
==25116==    by 0x404F0C: ??? (in /foo/bar/debug-binary)
==25116==    by 0x401FA4: ??? (in /foo/bar/debug-binary)
==25116==    by 0x402016: ??? (in /foo/bar/debug-binary)
==25116==    by 0x403B27: ??? (in /foo/bar/debug-binary)
==25116==    by 0x40295E: ??? (in /foo/bar/debug-binary)
==25116==    by 0x31A021D993: (below main) (in /lib64/libc-2.5.so)
==25116==  Address 0x539f188 is 24 bytes inside a block of size 48 free'd
==25116==    at 0x4A05D21: free (vg_replace_malloc.c:325)
==25116==    by 0x401F6B: ??? (in /foo/bar/debug-binary)
==25116==    by 0x402016: ??? (in /foo/bar/debug-binary)
==25116==    by 0x403B27: ??? (in /foo/bar/debug-binary)
==25116==    by 0x40295E: ??? (in /foo/bar/debug-binary)
==25116==    by 0x31A021D993: (below main) (in /lib64/libc-2.5.so)
Is the problem in my binary or in the system library (libc) that my application depends on?
I also do not know how to interpret the ??? entries. Is there another compilation flag that would get valgrind to provide more information?