Yes, it applies everywhere, but it is important to note the context in which it is intended to be used. It does not mean that the application as a whole crashes, which, as @PeterM pointed out, can be catastrophic in many cases. The goal is to build a system that as a whole never goes down but can handle errors internally. In our case these were telecommunication systems, which are expected to have downtime on the order of minutes per year.
The basic design is to layer the system and isolate the central parts, which supervise and control, from the other parts, which do the work. In OTP terminology there are supervisor and worker processes. Supervisors have the job of monitoring workers, and other supervisors, with the goal of restarting them correctly when they crash, while workers do all the actual work. Correctly structuring the system into layers using this principle of strictly separating functionality lets you pull most of the error handling out of the workers and into the supervisors. You try to end up with a small fault-safe error kernel which, if it is correct, can handle errors anywhere in the rest of the system. It is in this context that the let-it-crash philosophy is meant to be used.
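To make the structure concrete, here is a minimal sketch of such a supervisor (module and worker names are hypothetical); it does nothing but monitor its worker and restart it when it crashes:

```erlang
-module(my_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% one_for_one: if a worker dies, restart only that worker.
    %% Allow at most 5 restarts in 10 seconds before giving up and
    %% letting the failure propagate up the supervision tree.
    {ok, {{one_for_one, 5, 10},
          [{my_worker,                       %% child id
            {my_worker, start_link, []},     %% {Module, Function, Args}
            permanent, 5000, worker, [my_worker]}]}}.
```

Note that the worker itself contains no error-handling code for this; the restart policy lives entirely in the supervisor.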
You get a kind of paradox: you think about errors and failures everywhere, with the goal of actually handling them in as few places as possible.
How best to handle an error depends, of course, on the error and the system. Sometimes it is best to try to catch the error locally within a process and handle it there, with the option of failing again if that does not work. If you have a number of cooperating processes, it is often better to kill them all and restart them; it is a supervisor that does this, as in the sketch below.
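For the cooperating-processes case, the supervisor's restart strategy expresses this directly. A sketch assuming two hypothetical workers that depend on each other's state:

```erlang
init([]) ->
    %% one_for_all: if either worker dies, the supervisor terminates
    %% the other one as well and restarts both, so they come back up
    %% in a consistent state together.
    {ok, {{one_for_all, 5, 10},
          [{worker_a, {worker_a, start_link, []},
            permanent, 5000, worker, [worker_a]},
           {worker_b, {worker_b, start_link, []},
            permanent, 5000, worker, [worker_b]}]}}.
```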
You need a language that throws errors/exceptions when something goes wrong, so that you can either trap them or let the process crash. Just ignoring returned error values is not the same thing.
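In Erlang the usual idiom is to pattern-match on the expected success value, so an unexpected error return becomes an exception rather than being silently ignored (the file name here is just for illustration):

```erlang
%% Matching on {ok, Fd} means that if file:open/2 returns
%% {error, Reason}, the match fails with a badmatch exception,
%% the process crashes, and its supervisor can restart it.
{ok, Fd} = file:open("data.txt", [read]).
```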
rvirding Dec 09 '10 at 22:29