While developing our application, we have run into a really nasty bug that occurs in one particular situation. The symptom is simple: the process just disappears. The logs end abruptly, there are no crash dumps, and no zombie processes are left behind. Dr. Watson did not catch anything either, so we are left without a trace.
The bug is not easy to reproduce: it takes 3-4 hours on average of repeating the same steps, so there is probably a race condition somewhere. We have special functions that handle both SEH and regular exceptions, so none of those should go unnoticed.
Debugging has to be done on a dedicated machine, because the application runs on very specialized hardware, so only remote debugging is available. And with the remote debugger attached, C++ Builder does not notice that the application has gone; it then crashes and burns when we try to do any debugging in the nonexistent process.
We use a wide variety of technologies with this software:
- OpenGL
- DirectShow + some COTS filters
- COM / DCOM
- Serial COM ports talking to specialized equipment
- C++ Builder (which uses different stack frames than VC++)
So, as you can see, I have very little to work with. What I am doing now is trying to narrow it down by checking different places in the code, to see whether there is any particular point where the failure occurs. I am also stripping as many aspects as possible from the operation I perform, to make the reproduction case as simple as possible. But it is a very complicated operation, the process takes a lot of time, and time (as usual) is a scarce resource.
I am wondering whether anyone has good advice for me, either about the cause (in general, what makes a process just stop without any notice?) or about methods for debugging such an elusive crash.
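One way to do this kind of narrowing down is a checkpoint log that is flushed after every line, so that the last checkpoint reached is still on disk after the process vanishes. The following is only a minimal sketch, not the actual code from the application: `CheckpointLog`, `CHECKPOINT`, and `last_checkpoint` are hypothetical names invented for illustration, and it uses only the standard C++ library.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: appends one line per checkpoint and flushes it
// immediately, so the line has left the stream buffer before the next
// statement runs.
class CheckpointLog {
public:
    explicit CheckpointLog(const std::string& path)
        : out_(path.c_str(), std::ios::app) {}

    // Records the source location plus a free-form note.
    void mark(const char* file, int line, const std::string& note) {
        assert(file != nullptr);
        out_ << file << ':' << line << ' ' << note << '\n';
        out_.flush();  // hand the data to the OS right away
    }

    bool ok() const { return out_.good(); }

private:
    std::ofstream out_;
};

// Convenience macro (also hypothetical) that captures file and line.
#define CHECKPOINT(log, note) (log).mark(__FILE__, __LINE__, (note))

// Returns the last non-empty line of the log, i.e. the last checkpoint
// that was reached before the process went away.
std::string last_checkpoint(const std::string& path) {
    std::ifstream in(path.c_str());
    std::string line, last;
    while (std::getline(in, line)) {
        if (!line.empty()) last = line;
    }
    return last;
}
```

In use, one would create a single `CheckpointLog log("checkpoints.log");` and place calls such as `CHECKPOINT(log, "before serial write");` around the suspect operations, then inspect the file (or call `last_checkpoint`) after the process has disappeared. Flushing after every line costs some performance and may shift the timing of a race condition, so the checkpoints are best kept sparse.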