On Linux/x86-64, thread-local storage (TLS) is implemented through the special segment register %fs (see the x86-64 ABI, p. 23 ...)
So the following code (I am using C with the GCC __thread extension, but it behaves the same as C++11 thread_local)

    __thread int x;
    int f(void) { return x; }

is compiled (with gcc -O -fverbose-asm -S) to:

            .text
    .Ltext0:
            .globl  f
            .type   f, @function
    f:
    .LFB0:
            .file 1 "tl.c"
            .loc 1 3 0
            .cfi_startproc
            .loc 1 3 0
            movl    %fs:x@tpoff, %eax       # x,
            ret
            .cfi_endproc
    .LFE0:
            .size   f, .-f
            .globl  x
            .section        .tbss,"awT",@nobits
            .align 4
            .type   x, @object
            .size   x, 4
    x:
            .zero   4

So, contrary to your fears, access to TLS is really quick on Linux/x86-64: in the listing above it is a single movl relative to %fs. It is not exactly implemented as a table (instead, the kernel and the runtime manage the %fs segment register so that it points to a thread-specific memory zone, and the compiler and linker manage the offset within that zone). The older pthread_getspecific did go through a table, but it is nearly useless once you have TLS.
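To make the comparison concrete, here is a minimal sketch that keeps one per-thread int both ways: with a __thread variable and with the older key-based pthread API. The names tls_value, value_key, set_with_key and the other helpers are made up for this illustration, and error handling is deliberately thin.

```c
#include <pthread.h>
#include <stdlib.h>

/* Compiler-supported TLS: each thread gets its own copy of this variable. */
static __thread int tls_value;

/* Key-based API: the per-thread value is found through a key at run time. */
static pthread_key_t value_key;
static pthread_once_t value_key_once = PTHREAD_ONCE_INIT;

static void make_value_key(void) {
    /* free() is called on a thread's non-NULL slot when that thread exits. */
    pthread_key_create(&value_key, free);
}

void set_with_tls(int v) { tls_value = v; }
int get_with_tls(void) { return tls_value; }

/* Stores v in the calling thread's slot; returns 0 on success, -1 on failure. */
int set_with_key(int v) {
    pthread_once(&value_key_once, make_value_key);
    int *slot = pthread_getspecific(value_key);
    if (slot == NULL) {
        /* First use in this thread: allocate the slot and register it. */
        slot = malloc(sizeof *slot);
        if (slot == NULL)
            return -1;
        if (pthread_setspecific(value_key, slot) != 0) {
            free(slot);
            return -1;
        }
    }
    *slot = v;
    return 0;
}

/* Returns the calling thread's value, or 0 if it never stored one. */
int get_with_key(void) {
    pthread_once(&value_key_once, make_value_key);
    int *slot = pthread_getspecific(value_key);
    return slot ? *slot : 0;
}
```

The __thread accessors are plain loads and stores that the compiler can inline, while the key-based ones need a library call, a NULL check and an allocation on first use. Build with gcc -pthread.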
BTW, by definition, all threads of the same process share the same address space in virtual memory, since a process has its own single address space (see /proc/self/maps etc.; see proc(5) for more about /proc/, and also mmap(2); the C++11 thread library is based on pthreads, which are implemented using clone(2)). So “thread-specific memory mapping” is a contradiction: as soon as a task (the thing scheduled by the kernel) has its own address space, it is called a process (not a thread). The defining characteristic of threads in the same process is that they share a common address space (and some other entities, like file descriptors).
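The shared address space can be observed directly: a thread-local variable lives at a different address in each thread, yet any thread that is handed a pointer to another thread's copy can read and write it. The sketch below assumes POSIX threads and unnamed semaphores; struct probe, wait_for and poke_other_threads_tls are hypothetical names chosen for the example.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static __thread int x;      /* one copy per thread */

struct probe {
    int *worker_x;          /* address of the worker's copy of x */
    int worker_saw;         /* value the worker finally read from its x */
    sem_t published;        /* worker -> main: address is available */
    sem_t written;          /* main -> worker: value has been overwritten */
};

/* sem_wait() that retries when interrupted by a signal. */
static void wait_for(sem_t *s) {
    while (sem_wait(s) == -1 && errno == EINTR) { }
}

static void *worker(void *arg) {
    struct probe *p = arg;
    x = 1;                  /* touches only the worker's copy */
    p->worker_x = &x;
    sem_post(&p->published);
    wait_for(&p->written);  /* stay alive while main writes through the pointer */
    p->worker_saw = x;
    return NULL;
}

/* Returns the value the worker read back from its own x, or -1 on error. */
int poke_other_threads_tls(void) {
    struct probe p = { 0 };
    pthread_t t;
    sem_init(&p.published, 0, 0);
    sem_init(&p.written, 0, 0);
    x = 100;                /* main thread's copy */
    if (pthread_create(&t, NULL, worker, &p) != 0)
        return -1;
    wait_for(&p.published);
    int distinct = (p.worker_x != &x);  /* two copies, two addresses */
    *p.worker_x = 42;       /* same address space: the write reaches the worker */
    sem_post(&p.written);
    pthread_join(t, NULL);
    sem_destroy(&p.published);
    sem_destroy(&p.written);
    /* main's own copy must be untouched by the write above */
    return (distinct && x == 100) ? p.worker_saw : -1;
}
```

The worker has to stay alive while the main thread writes, because its TLS block is released when the thread exits. Build with gcc -pthread.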