How can I get my pthreads to execute a function every time they are rescheduled by the kernel?
I need to determine which physical processor / socket (not the logical core) my thread is running on, and I cannot afford to query this on every operation.
Can the wake-up path be hooked in some way, so that the necessary TLS updates happen only when the thread has actually been migrated?
Background: I have code that executes AMOs (atomic memory operations) approximately every 70 ns per thread. That is fine as long as the address is not cached on another socket, but deploying the same code across two sockets gives a 15x performance hit due to frequent cache invalidation. I intend to allocate memory specifically for this purpose, used only by threads that share the same L3 cache. I therefore need to determine which socket I am running on and address the correct memory block. I could obviously call sched_getcpu and compare the result to the physical CPU ID in /proc/cpuinfo, but that is a fairly large overhead. I also cannot afford to allocate thread-private memory for each thread; that is too expensive.
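For reference, here is a minimal sketch of the fallback I mentioned. It does not hook the scheduler at all. The only change from the naive version is that the CPU-to-socket mapping is read once at startup instead of parsing /proc/cpuinfo each time. It assumes Linux with sysfs mounted and reads `physical_package_id` from the CPU topology directory. The names `MAX_CPUS`, `cpu_to_socket`, `build_socket_table` and `current_socket` are my own illustrative choices, not an existing API.

```c
#define _GNU_SOURCE          /* needed for sched_getcpu() in <sched.h> */
#include <assert.h>
#include <sched.h>
#include <stdio.h>

#define MAX_CPUS 1024        /* assumption: upper bound on logical CPU ids */

/* cpu_to_socket[i] is the physical package id of logical CPU i, or -1 if unknown. */
static int cpu_to_socket[MAX_CPUS];

/* Read the topology once at startup. Returns the number of CPUs mapped. */
static int build_socket_table(void)
{
    int mapped = 0;

    for (int cpu = 0; cpu < MAX_CPUS; cpu++) {
        char path[128];
        int n, pkg;
        FILE *f;

        cpu_to_socket[cpu] = -1;

        n = snprintf(path, sizeof path,
                     "/sys/devices/system/cpu/cpu%d/topology/physical_package_id",
                     cpu);
        assert(n > 0 && (size_t)n < sizeof path);   /* path was not truncated */

        f = fopen(path, "r");
        if (f == NULL)
            continue;                               /* CPU id not present */
        if (fscanf(f, "%d", &pkg) == 1) {
            cpu_to_socket[cpu] = pkg;
            mapped++;
        }
        fclose(f);
    }
    return mapped;
}

/* Per-operation lookup: one sched_getcpu() call plus one array read. */
static inline int current_socket(void)
{
    int cpu = sched_getcpu();

    if (cpu < 0 || cpu >= MAX_CPUS)
        return -1;
    return cpu_to_socket[cpu];
}
```

The idea would be to call `build_socket_table()` once before starting the threads and then index the per-socket memory blocks with `current_socket()`. The result can already be stale by the time it is used, because the thread may migrate right after the call. As far as I can tell that only costs performance for that one operation, since the AMO itself is still correct on any block. Whether `sched_getcpu` is cheap enough at a 70 ns period depends on how the platform implements it, which is exactly why I am asking for a migration hook.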