Why isn't converting a conditional record to an unconditional record stream-optimized?

In a conversation about concurrency and the C ++ 11 memory model, Herb Sutter gives examples of invalid optimizations.

http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-2-of-2

From slide per minute 17:

void f(vector<widget>& v) {
    if(v.length()>0) xMutex.lock();
    for(int i = 0; i < v.length(); ++i)
        ++x;                                  // write is conditional
    if(v.length()>0) xMutex.unlock();
}

"A very likely (if deeply mistaken) transformation of the central loop:"

r1 = x;
for(int i = 0; i < v.length(); ++i)
    ++r1;                                     // oops: write is not conditional
x = r1;

He explains: "... this record is not conditional, it will be executed at each execution, even if doOptionalWork is false, which will lead to a record of the record that is not protected by a mutex lock that introduces the race ..."

Why does he say that the invented record is not protected by a mutex lock? I realized that the complete conversion is described as follows.

// "optimized" version 1
void f(vector<widget>& v) {
    if(v.length() > 0) xMutex.lock()
    r1 = x;
    for(int i = 0; i < v.length(); ++i)
        ++r1;
    x = r1;
    if(v.length() > 0) xMutex.unlock();
}

But this may be so.

// "optimized" version 2
void f(vector<widget>& v) {
    if(v.length() > 0) xMutex.lock()
    r1 = x;
    for(int i = 0; i < v.length(); ++i)
        ++r1;
    if(v.length() > 0) xMutex.unlock();
    x = r1;
}

, 2 , 1. 1? , , x ?

" v.length() 0, ...", , . , .

+4
4

, -. , , .

+5

, , :

xMutex.lock()
++x;
xMutex.unlock();

() f , x , .

+2

mutex. lock() - , . - , lock() . , , unlock().

"", .

.

, lock() unlock() .

, , , , , . ( ) , . , , .

0

, , .

#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

// we have very simple widgets
typedef int widget;

// x and its mutex are shared by the threads
static int x = 0;
std::mutex xMutex;

static std::chrono::milliseconds central_loop_dur(10);
static std::chrono::milliseconds before_central_loop(0);

void f(std::vector<widget>& v) {

    // The first thread acquires the lock.
    // The second thread has an empty vector so passes through before the lock is released.
    if(v.size() > 0) xMutex.lock();
    // The first thread will take about 50ms to write to x.
    // The second thread reads x nearly concurrently with the first thread. Both see 0.
    int r = x;

    // The first thread passes through here.
    // The second thread snoozes.
    std::this_thread::sleep_for(before_central_loop);
    before_central_loop = std::chrono::milliseconds(200);

    for(auto i : v) {
        std::this_thread::sleep_for(central_loop_dur);
        ++r;
    }
    // At 50ms, the first thread writes to x.
    // At 200ms, the second thread obliviously writes a previous value of x back to x.
    x = r;
    if(v.size() > 0) xMutex.unlock();
}

void thread_main(size_t vec_size) {
    std::vector<widget> v(vec_size);
    f(v);
}

int main() {
    std::thread t1(thread_main, 5);
    std::thread t2(thread_main, 0);
    t1.join();
    t2.join();

    std::cout << "The value of x = " << x << std::endl;
}

, , x = 5. ( ) x = 0, 1 5 !

0

Source: https://habr.com/ru/post/1546494/


All Articles