Why do system calls return EFAULT instead of sending SIGSEGV?

To be clear, this is a design question, not an implementation question.

I want to know why POSIX behaves like this. When given an invalid memory address, POSIX system calls return EFAULT rather than crashing the user-space program (by sending SIGSEGV), which makes their behavior inconsistent with user-space functions.

Why? Doesn't that just hide memory errors? Is this a historical mistake or is there a good reason for this?

2 answers

Because system calls are executed by the kernel and not by the user program: when a system call occurs, the user process stops and waits for the kernel to finish.

The kernel itself, of course, is not allowed to segfault, so it must manually check every address range that the user process provides. If one of these checks fails, the system call fails with an EFAULT error. Thus, in this situation, the segmentation fault never actually occurs: the kernel avoided it by explicitly checking that all addresses were valid. Therefore, it makes sense that no signal is sent.
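To see this check from the user side, here is a minimal C sketch, assuming Linux. The helper name errno_from_bad_write is made up for illustration. It hands write() a buffer inside a page mapped with PROT_NONE, so the kernel cannot read it, and reports the errno that comes back. A pipe is used as the target because some devices (for example /dev/null) may never touch the buffer at all.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): returns the errno set by
 * write() when its buffer lies in an inaccessible page. Returns -1 if the
 * setup fails or if the write unexpectedly succeeds. */
int errno_from_bad_write(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* Map one page with no access rights: a guaranteed-bad user buffer. */
    long page = sysconf(_SC_PAGESIZE);
    void *bad = mmap(NULL, (size_t)page, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (bad == MAP_FAILED) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* The kernel, not this process, tries to read from the buffer. */
    errno = 0;
    ssize_t n = write(fds[1], bad, 1);
    int saved = (n == -1) ? errno : -1;

    munmap(bad, (size_t)page);
    close(fds[0]);
    close(fds[1]);
    return saved;
}
```

On Linux I would expect errno_from_bad_write() to return EFAULT and the process to keep running. Other systems may behave differently.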

In addition, if a signal were sent, the kernel could not attach a meaningful program counter to it; the user process is not actually executing while the system call runs. This means that the user process would not be able to produce decent diagnostics, restart the failed instruction, etc.
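For contrast, here is a rough C sketch of what a fault in user space gives you, again assuming Linux. The names on_segv and user_access_raises_sigsegv are invented for this example. The process reads from a PROT_NONE page itself, and the SIGSEGV handler receives the faulting address in si_addr. The third handler argument (ignored here) points to the saved context of the interrupted instruction, which is the information a signal raised from inside a system call could not meaningfully supply.

```c
#define _DEFAULT_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;
static void *volatile fault_addr;

/* Record the faulting address and jump back out of the faulting access. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx; /* saved context of the interrupted code; unused here */
    fault_addr = info->si_addr;
    siglongjmp(recover_point, 1);
}

/* Hypothetical helper: returns 1 if a user-space read of an inaccessible
 * page raised SIGSEGV with si_addr equal to that address, 0 if it did not,
 * and -1 if the setup failed. */
int user_access_raises_sigsegv(void)
{
    long page = sysconf(_SC_PAGESIZE);
    char *bad = mmap(NULL, (size_t)page, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (bad == MAP_FAILED)
        return -1;

    struct sigaction sa = {0}, old;
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(bad, (size_t)page);
        return -1;
    }

    int result = 0;
    fault_addr = NULL;
    if (sigsetjmp(recover_point, 1) == 0) {
        /* This process touches the page itself, so the hardware faults. */
        volatile char c = *(volatile char *)bad;
        (void)c;
    } else {
        /* We arrive here from the handler via siglongjmp. */
        result = (fault_addr == (void *)bad);
    }

    sigaction(SIGSEGV, &old, NULL); /* restore the previous handler */
    munmap(bad, (size_t)page);
    return result;
}
```

Jumping out of a SIGSEGV handler like this is only reasonable in a small demonstration; real programs usually just log and exit.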

To summarize: mostly historical, but there is real reasoning behind it. As with EINTR, that does not make it any less annoying to deal with.


Well, what would you prefer? A system call is a request to the system. If you ask "When does the ferry leave for Munich?", would you want your program to crash, or to get a return value of -1 with errno = ENOHARBOR? If you ask the system to put your car in your purse, would you want your purse destroyed, or a return value of -1 with errno set to EBAGTOOSMALL?

There is also a technical detail: arguments passed between user space and kernel space must be copied when entering and leaving a system call. Mostly for security reasons, the kernel is very careful about writing to user space. (For this, Linux has the copy_to_user() function, and copy_from_user() for the opposite direction, which validate the user-space address range before performing the actual copy.)


Source: https://habr.com/ru/post/910186/

