Most bugs are fairly simple, easy to reproduce, and easy to debug. What do you do when you come across ones that are difficult or impossible to reproduce under the debugger?
Our application is multi-threaded and, to complicate matters, communicates with several clients through remote access. Occasionally we hit bugs that take weeks to track down. Sometimes we can't even be sure a problem has been fixed, because it is so intermittent; it may just be a coincidence that it hasn't been seen for a while.
We already have an error reporting system, so if we are lucky and the bug throws an exception, we get a stack trace. Even that is not always enough, because the stack does not show, for example, how a particular variable ended up with the value it had. This is especially true when the exception occurs in a workflow (which is most often the case).
And then there are the bugs that don't even throw exceptions; the behavior is simply unexpected, and it only happens a small percentage of the time.
This is in .NET, so some of the memory and pointer issues are hidden from us, but we use many third-party components that are not managed code, plus a fair amount of COM interop, so things can still get murky.
Obviously there are no simple answers, since I am not asking about one specific bug. But what general principles and tactics do you use to tackle problems like these?