Intel recommends using the PAUSE instruction only when the spin loop is very short.
As I understand from your question, the waits in your case are very long. In that case, spin loops are not recommended.
You wrote that you have a "thread that keeps scanning some places (for example, a queue) to retrieve new nodes."
For this situation, Intel recommends using the synchronization API functions of your operating system. For example, you can create an event that is signaled when a new node appears in the queue, and simply wait for that event with WaitForSingleObject(Handle, INFINITE). The queue signals the event whenever a new node appears.
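The event-based wait described above can be sketched portably. This is a minimal sketch that uses std::condition_variable as a stand-in for a Win32 event, not the Win32 API itself; the class name BlockingQueue and its members are hypothetical. On Windows, the same shape would typically be built from CreateEvent, SetEvent and WaitForSingleObject.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical blocking queue: the consumer sleeps inside the OS until a
// producer signals that a node is available, instead of polling in a loop.
template <typename T>
class BlockingQueue {
public:
    // Adds a node and wakes one waiting consumer.
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        ready_.notify_one();  // plays the role of signaling the event
    }

    // Blocks without burning CPU cycles until a node is available.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
};
```

A consumer thread would simply call pop() in its loop; while the queue is empty the thread is descheduled rather than spinning.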
According to the Intel Optimization Guide, the PAUSE instruction is typically used by software threads running on two logical processors of the same physical core, while one of them waits for a lock to be released. Such short waits tend to last between tens and a few hundred cycles (i.e. 20-500 CPU cycles), so from a performance standpoint it is better to wait while occupying the processor than to yield to the OS.
500 CPU cycles on a Core i7 7700K running at 4500 MHz take about 0.00000011 seconds (roughly 111 ns), that is, about one nine-millionth of a second: the processor can run such a 500-cycle loop about 9 million times per second.
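The arithmetic above can be checked with a small helper. This is only a sketch: the function names cycles_to_ns and waits_per_second are hypothetical, and the calculation assumes a fixed clock frequency, which real processors do not strictly maintain.

```cpp
// Converts a cycle count to nanoseconds for a clock frequency given in MHz.
// At f MHz, one cycle lasts 1000 / f nanoseconds.
constexpr double cycles_to_ns(double cycles, double clock_mhz) {
    return cycles * 1000.0 / clock_mhz;
}

// Number of back-to-back waits of the given length that fit in one second.
constexpr double waits_per_second(double cycles, double clock_mhz) {
    return clock_mhz * 1e6 / cycles;
}

// Example: cycles_to_ns(500, 4500) is about 111 ns, and
// waits_per_second(500, 4500) is 9 million.
```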
As you can see, this PAUSE instruction is designed for very short periods of time.
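For the short-wait case that PAUSE is meant for, a spin loop might look like the sketch below. The names SpinLock and cpu_relax are hypothetical, and the non-x86 fallback to std::this_thread::yield() is an assumption made only so the example compiles elsewhere. On x86, the _mm_pause() intrinsic emits the PAUSE instruction.

```cpp
#include <atomic>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>
inline void cpu_relax() { _mm_pause(); }  // emits PAUSE on x86
#else
#include <thread>
inline void cpu_relax() { std::this_thread::yield(); }  // assumed fallback
#endif

// Hypothetical minimal spin lock, suitable only for very short critical sections.
class SpinLock {
public:
    void lock() {
        for (;;) {
            // Try to take the lock; exchange returns the previous value.
            if (!locked_.exchange(true, std::memory_order_acquire)) return;
            // Lock is held: poll with plain loads and PAUSE between polls.
            while (locked_.load(std::memory_order_relaxed)) cpu_relax();
        }
    }

    // Returns true if the lock was free and is now held by the caller.
    bool try_lock() {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};
```

A lock like this only pays off when the holder releases it within a few hundred cycles; for anything longer, the waiting thread keeps a core busy with no useful work.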
On the other hand, every call to an API function such as Sleep() incurs the expensive cost of a context switch, which can be 10,000+ cycles; it also pays for the ring 3 to ring 0 transition, which can be 1000+ cycles.
If there are more threads than available processor cores (multiplied by the hyper-threading factor, if any), and a thread is switched out in the middle of a critical section, then the threads waiting for that critical section may wait a really long time, at least 10,000+ cycles, so the PAUSE instruction will be useless.
When the wait is expected to last thousands of cycles or more, it is preferable to yield to the operating system by calling one of the OS synchronization API functions, such as WaitForSingleObject on Windows.
In conclusion: in your scenario, the PAUSE instruction is not the best choice, because your wait time is long and PAUSE is designed for very short loops. PAUSE takes just 131 cycles on Skylake or later processors. For example, that is only 31.19 ns on an Intel Core i7-7700K @ 4.20 GHz (Kaby Lake).
On earlier processors such as Haswell, it takes about 9 cycles, which is 2.81 ns on an Intel Core i5-4430 @ 3 GHz. Thus, for long waits, it is better to yield control to other threads through the OS synchronization API than to occupy the CPU in a PAUSE loop.