How does fork() return for the child process?

Question

I know fork() returns differently for the child and parent processes, but I can't find information on how this happens. How does the child process receive the return value 0 from fork()? And what is the difference with regard to the call stack? As far as I understand, for the parent it goes something like this:

parent process - invokes fork → system_call - calls fork → fork executes - returns to → system_call - returns to → parent process.

What happens in a child process?

+12

linux fork

EpsilonVector Apr 01 '10 at 18:15

6 answers

The fork() system call returns twice (unless it fails).

One of the returns is in the child process, and the return value is 0.
The other return is in the parent process, and there the return value is non-zero (either negative if the fork failed, or a positive value that is the PID of the child).
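A minimal sketch of the three-way check described above, assuming a POSIX system. The helper name `fork_and_report` and the exit code 7 are made up for illustration; they are not part of any API.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once and prints which branch each process took.
 * Returns the child's exit status as seen by the parent, or -1 on failure. */
int fork_and_report(void)
{
    fflush(stdout);               /* avoid duplicating buffered output in the child */

    pid_t pid = fork();
    if (pid < 0) {                /* fork failed: only the parent exists */
        perror("fork");
        return -1;
    }

    if (pid == 0) {               /* this return happens in the child */
        printf("child: fork() returned 0, my PID is %ld\n", (long)getpid());
        fflush(stdout);
        _exit(7);                 /* arbitrary exit code for the demo */
    }

    /* this return happens in the parent: pid is the child's PID */
    printf("parent: fork() returned %ld\n", (long)pid);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `fork_and_report()` from `main` prints one line from each process; which line appears first depends on the scheduler.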

The main differences between parent and child are:

They are separate processes.
The PID value is different.
The PPID (parent PID) value is different.

Other more obscure differences are listed in the POSIX standard.
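The PID and PPID differences can be observed directly. The following is only a sketch under POSIX assumptions: the child reports `getpid()` and `getppid()` back through a pipe and the parent compares them with what it sees. The helper name `child_ids_match` is hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child's view of its own PID and of
 * its parent's PID matches what the parent sees, 0 if not, -1 on error. */
int child_ids_match(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t parent_pid = getpid();
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: send getpid() and getppid() to the parent. */
        pid_t ids[2] = { getpid(), getppid() };
        close(fds[0]);
        ssize_t n = write(fds[1], ids, sizeof ids);
        _exit(n == (ssize_t)sizeof ids ? 0 : 1);
    }

    /* Parent: read the child's report, then reap the child. */
    pid_t ids[2] = { 0, 0 };
    close(fds[1]);
    ssize_t n = read(fds[0], ids, sizeof ids);
    close(fds[0]);
    if (waitpid(pid, NULL, 0) < 0 || n != (ssize_t)sizeof ids)
        return -1;

    /* ids[0] is the child's own PID, ids[1] is the PID it sees as its parent. */
    return ids[0] == pid && ids[1] == parent_pid && ids[0] != parent_pid;
}
```

The parent stays alive (blocked in `read`) until the child has written, so the child's `getppid()` is expected to be the parent's PID here.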

In one sense, the "how" is really not your problem: the operating system is required to achieve the result. Roughly, though, the o/s clones the parent process, making a second (child) process that is an almost exact copy of the parent, sets the attributes that must differ to their correct new values, and usually marks the data pages as CoW (copy on write) or equivalent, so that when one process modifies a value it gets its own separate copy of the page and does not interfere with the other. This is unlike the obsolete (and, as far as I know, not in the POSIX standard) vfork() system call, which you would be wise to avoid even if it is available on your system. Each process continues after fork() as if the function had returned; hence (as I said at the top) the fork() system call returns twice, once in each of two processes that are near-identical clones of each other.
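Whether or not the kernel uses CoW is an implementation detail, but the visible effect (a write in one process does not show up in the other) is easy to check. A small sketch, assuming POSIX; the helper name `parent_value_after_child_write` is made up for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child overwrites its copy of `value`, then the
 * parent returns its own copy once the child has exited. -1 means error. */
int parent_value_after_child_write(void)
{
    volatile int value = 1;       /* volatile so the accesses really hit memory */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        value = 2;                /* modifies only the child's copy */
        _exit(value == 2 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return value;                 /* the parent's copy is untouched by the child */
}
```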

+7

Jonathan Leffler Apr 01 '10 at 18:37

The parent and the child get different return values because the kernel manipulates the CPU registers in the child's context.

Each process in the Linux kernel is represented by a task_struct. The task_struct is referenced (by pointer) from the thread_info structure, which lies at the end of the kernel-mode stack. The whole CPU context (the registers) is stored in this thread_info structure.

    struct thread_info {
        struct task_struct *task;            /* main task structure */
        struct cpu_context_save cpu_context; /* cpu context */
    };

All of the fork()/clone() system calls end up in the kernel function do_fork().

    long do_fork(unsigned long clone_flags,
                 unsigned long stack_start,
                 struct pt_regs *regs,
                 unsigned long stack_size,
                 int __user *parent_tidptr,
                 int __user *child_tidptr)

Here is the execution sequence

do_fork() → copy_process() → copy_thread() (copy_thread is an architecture-specific function)

copy_thread() copies the register values from the parent and changes the return value to 0 (shown here for ARM):

    struct pt_regs *childregs = task_pt_regs(p);
    *childregs = *regs;                                 /* copy register values from the parent process */
    childregs->ARM_r0 = 0;                              /* change the return value */
    thread->cpu_context.sp = (unsigned long)childregs;  /* write the value back to thread_info */
    thread->cpu_context.pc = (unsigned long)ret_from_fork;

When the child gets scheduled, it executes the assembly routine ret_from_fork(), which returns zero. The parent gets its return value from do_fork(), which is the PID of the child:

    nr = task_pid_vnr(p);
    return nr;
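To make the register trick concrete, here is a toy user-space model of the two steps above. It is not kernel code: `struct toy_regs`, `toy_copy_thread` and `toy_finish_fork` are invented names that only mimic the idea of copying the parent's saved registers and overwriting the return-value register.

```c
/* Toy stand-in for the saved user-mode registers; NOT the kernel's pt_regs. */
struct toy_regs {
    long r0;   /* return-value register (r0 on ARM) */
    long pc;   /* where user code resumes after the system call */
};

/* Mimics the copy_thread() step: the child starts from a copy of the
 * parent's registers, with the return-value register forced to 0. */
struct toy_regs toy_copy_thread(const struct toy_regs *parent)
{
    struct toy_regs child = *parent;
    child.r0 = 0;
    return child;
}

/* Mimics the tail of do_fork(): the parent's return value is the child's PID. */
void toy_finish_fork(struct toy_regs *parent, long child_pid)
{
    parent->r0 = child_pid;
}
```

Both register sets keep the same `pc`, so both processes resume at the instruction right after the system call; only the value they find in the return register differs.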

+7

sysinit May 23 '14 at 2:10

Steven Schlansker's answer is quite good, but just to add some more detail:

Every running process has an associated context (hence "context switching"). This context includes, among other things, the process's code segment (containing the machine instructions), its heap memory, its stack, and its register contents. When a context switch occurs, the context of the old process is saved and the context of the new process is loaded.

The location of a return value is defined by the ABI, to allow code interoperability. If I am writing ASM code for my x86-64 processor and I call into the C runtime, I know that the return value will show up in the RAX register.

Putting these two things together, the logical conclusion is that the call int pid = fork() results in two contexts where the next instruction to execute in each one is the one that moves the value of RAX (the return value from the fork call) into the local variable pid. Of course, only one process can execute at a time on a single CPU, so the order in which these "returns" happen is determined by the scheduler.

+3

danben Apr 01 '10 at 18:40

I will try to answer in terms of the process memory layout. Guys, please correct me if something is wrong or inaccurate.

In traditional Unix, fork() is the only system call for process creation (except for the very first process, process 0), so the real question is what happens during process creation in the kernel. There are two kernel data structures related to a process: the struct proc array (aka the process table) and struct user (aka the u area).

To create a new process, these two data structures have to be properly created or parameterized. The straightforward way is to model them on the creator's (or parent's) proc entry and u area. Most of the data is duplicated between parent and child (for example, the code segment), except for the value in the return register (for example, EAX on 80x86), which holds the child's PID for the parent and 0 for the child. From then on you have two processes (the existing one and the new one) run by the scheduler, and when each is scheduled it returns its own value.

+1

newID Mar 12 '13 at 16:22

The process appears identical from both sides, except for the differing return value (that is why the return value is there: so that the two processes can tell the difference at all!). As far as the child process is concerned, it has just returned from system_call in the same way the parent process did.

0

Amber Apr 01 '10 at 18:18

Steven Schlansker · Accepted Answer · 2010-04-01 18:19

% man fork

RETURN VALUES

Upon successful completion, fork() returns a value of 0 to the child process and returns the process ID of the child process to the parent process. Otherwise, a value of -1 is returned to the parent process, no child process is created, and the global variable errno is set to indicate the error.

What happens is that inside the fork system call the whole process is duplicated. Then the fork call returns. These are now different contexts, so they can return different return codes.

If you really want to know how this works at a low level, you can always check the source! The code is a bit confusing if you are not used to reading kernel code, but the inline comments give a pretty good hint as to what is going on.

The most interesting part of the source, with an explicit answer to your question, is at the very end of the fork() definition itself:

    if (error == 0) {
        td->td_retval[0] = p2->p_pid;
        td->td_retval[1] = 0;
    }

"td" apparently holds a list of the return values for different threads. I'm not sure exactly how this mechanism works (why there are not two separate "thread" structures). If error (returned from fork1, the "real" forking function) is 0 (no error), then take the "first" (parent) thread and set its return value to the PID of p2 (the new process). If it is the "second" thread (in p2), set the return value to 0.
