While testing a program for scalability, I came across a situation where I need to perform a memcpy as an atomic operation. I need to copy 64 bytes of data from one location to another.
I came across one solution that spins on a flag variable:
    struct record {
        volatile int startFlag;
        char data[64];
        volatile int doneFlag;
    };
and the pseudo code follows:
    struct record *node;

    if (node->startFlag == 0) {                        // test the flag
        if (CompareAndSwap(node->startFlag, 0, 1)) {   // all threads try to set it; only one succeeds and performs the memcpy
            memcpy(destination, source, NoOfBytes);
            node->doneFlag = 1;                        // spin variable for the threads that failed the CompareAndSwap
        } else {
            while (node->doneFlag == 0) {              // the other threads spin
                ;                                      // spin and/or use a back-off policy
            }
        }
    }
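For concreteness, here is a minimal sketch of the same idea written with C11 `<stdatomic.h>` instead of `volatile` flags. It is an illustration under assumptions, not the code I actually run: the function name `publish_once`, the constant `RECORD_BYTES`, and the choice of `node->data` as the copy destination are hypothetical. It also differs from the pseudo code in one respect: a thread that finds `startFlag` already set waits on `doneFlag` too, instead of skipping the wait.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define RECORD_BYTES 64   /* hypothetical name for the 64-byte payload size */

struct record {
    atomic_int startFlag;        /* 0 = unclaimed, 1 = a writer has claimed the record */
    char data[RECORD_BYTES];
    atomic_int doneFlag;         /* 0 = copy not finished, 1 = data published */
};

/* Returns 1 if the calling thread performed the copy, 0 if it only
   waited for (or found) another thread's copy. */
static int publish_once(struct record *node, const char *source)
{
    assert(node != NULL && source != NULL);

    /* Test-and-test-and-set: a cheap load first, the CAS only if the record looks free. */
    if (atomic_load_explicit(&node->startFlag, memory_order_relaxed) == 0) {
        int expected = 0;
        if (atomic_compare_exchange_strong_explicit(
                &node->startFlag, &expected, 1,
                memory_order_acq_rel, memory_order_relaxed)) {
            memcpy(node->data, source, RECORD_BYTES);
            /* Release store: the copied bytes are visible to any thread
               that later observes doneFlag == 1 with an acquire load. */
            atomic_store_explicit(&node->doneFlag, 1, memory_order_release);
            return 1;
        }
    }

    /* Lost the race, or the record was already claimed: wait for publication. */
    while (atomic_load_explicit(&node->doneFlag, memory_order_acquire) == 0) {
        /* spin; a back-off policy could go here */
    }
    return 0;
}
```

With this shape, readers never see a half-written `data` as long as they only read it after `publish_once` returns, but the waiting threads still depend on the winner making progress.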
Can this work as an atomic memcpy? One concern: if the thread executing the memcpy is preempted (before or after the memcpy, but before setting doneFlag), the other threads will keep spinning. If this does not work, what can be done to make the copy atomic?
In my situation the other threads have to wait until the data has been copied, because they must compare the inserted data with their own data.
I use the test-and-test-and-set method on startFlag in order to avoid some costly atomic operations. Spin-locks are also scalable, but I measured that atomic calls give better performance than spin-locks; moreover, I am looking for problems that may arise in this fragment. Since I use my own memory manager, memory allocation and free calls are expensive for me. Copying the contents into a separate buffer and then setting a pointer to it (since a pointer is small enough to be updated atomically) is therefore expensive, because it would require a lot of mem-alloc and mem-free calls.
EDIT: I do not use a mutex because mutexes do not seem to be scalable, and this is just one part of the program, so the critical section is not that small (I understand that atomic operations are difficult to use for a larger critical section).