I am doing a performance evaluation of Windows CE versus Linux on an ARM i.MX27 board. The code has already been written for CE and measures the time it takes to make various kernel calls, such as using OS primitives (mutexes and semaphores), opening and closing files, and networking.
While porting this application to Linux (pthreads), I ran into a problem that I cannot explain. Almost all tests showed a 5- to 10-fold performance increase on Linux, except for my version of Win32 events (SetEvent and WaitForSingleObject); CE actually “won” this test.
To emulate the behavior, I used pthreads condition variables (I know that my implementation does not fully emulate the CE version, but it is good enough for this evaluation).
The test code uses two threads that ping-pong with each other using events.
Windows code:
Thread 1 (the measured thread):

    HANDLE hEvt1, hEvt2;

    /* Auto-reset events, initially non-signaled */
    hEvt1 = CreateEvent(NULL, FALSE, FALSE, TEXT("MyLocEvt1"));
    hEvt2 = CreateEvent(NULL, FALSE, FALSE, TEXT("MyLocEvt2"));

    ResetEvent(hEvt1);
    ResetEvent(hEvt2);

    for (i = 0; i < 10000; i++)
    {
        SetEvent(hEvt1);                       /* ping thread 2 */
        WaitForSingleObject(hEvt2, INFINITE);  /* wait for the reply */
    }

Thread 2 (just responds):

    while (1)
    {
        WaitForSingleObject(hEvt1, INFINITE);  /* wait for the ping */
        SetEvent(hEvt2);                       /* reply to thread 1 */
    }

Linux code:
Thread 1 (the measured thread):

    struct event_flag *event1, *event2;

    event1 = eventflag_create();
    event2 = eventflag_create();

    for (i = 0; i < 10000; i++)
    {
        eventflag_set(event1);   /* ping thread 2 */
        eventflag_wait(event2);  /* wait for the reply */
    }

Thread 2 (just responds):

    while (1)
    {
        eventflag_wait(event1);  /* wait for the ping */
        eventflag_set(event2);   /* reply to thread 1 */
    }

My implementation of eventflag_*:

    struct event_flag* eventflag_create()
    {
        struct event_flag* ev;
        ev = (struct event_flag*) malloc(sizeof(struct event_flag));

        pthread_mutex_init(&ev->mutex, NULL);
        pthread_cond_init(&ev->condition, NULL);
        ev->flag = 0;

        return ev;
    }

    void eventflag_wait(struct event_flag* ev)
    {
        pthread_mutex_lock(&ev->mutex);

        /* Loop to guard against spurious wakeups */
        while (!ev->flag)
            pthread_cond_wait(&ev->condition, &ev->mutex);

        ev->flag = 0;  /* consume the event (auto-reset) */

        pthread_mutex_unlock(&ev->mutex);
    }

    void eventflag_set(struct event_flag* ev)
    {
        pthread_mutex_lock(&ev->mutex);

        ev->flag = 1;
        pthread_cond_signal(&ev->condition);

        pthread_mutex_unlock(&ev->mutex);
    }

And the struct:

    struct event_flag
    {
        pthread_mutex_t mutex;
        pthread_cond_t  condition;
        unsigned int    flag;
    };

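
For anyone who wants to reproduce the measurement, the Linux pieces above could be put together into a single file roughly like this. It is only a sketch under a few assumptions that are not part of the original benchmark: timing is done with clock_gettime(CLOCK_MONOTONIC), the names run_pingpong, responder, eventflag_destroy and ROUND_TRIPS are hypothetical, and the responder loop is bounded (instead of while (1)) so that the program can terminate.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() */
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

#define ROUND_TRIPS 10000  /* same iteration count as the test above */

struct event_flag {
    pthread_mutex_t mutex;
    pthread_cond_t  condition;
    unsigned int    flag;
};

static struct event_flag *eventflag_create(void)
{
    struct event_flag *ev = malloc(sizeof(*ev));
    if (ev == NULL)
        return NULL;
    pthread_mutex_init(&ev->mutex, NULL);
    pthread_cond_init(&ev->condition, NULL);
    ev->flag = 0;
    return ev;
}

/* Hypothetical cleanup helper, not in the original code. */
static void eventflag_destroy(struct event_flag *ev)
{
    if (ev == NULL)
        return;
    pthread_cond_destroy(&ev->condition);
    pthread_mutex_destroy(&ev->mutex);
    free(ev);
}

static void eventflag_wait(struct event_flag *ev)
{
    pthread_mutex_lock(&ev->mutex);
    while (!ev->flag)  /* loop guards against spurious wakeups */
        pthread_cond_wait(&ev->condition, &ev->mutex);
    ev->flag = 0;      /* consume the event, like an auto-reset event */
    pthread_mutex_unlock(&ev->mutex);
}

static void eventflag_set(struct event_flag *ev)
{
    pthread_mutex_lock(&ev->mutex);
    ev->flag = 1;
    pthread_cond_signal(&ev->condition);
    pthread_mutex_unlock(&ev->mutex);
}

static struct event_flag *event1, *event2;

/* Thread 2: answers each ping. Bounded here so the thread can be joined. */
static void *responder(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUND_TRIPS; i++) {
        eventflag_wait(event1);
        eventflag_set(event2);
    }
    return NULL;
}

/* Thread 1: returns the average round-trip time in microseconds,
 * or a negative value if setup failed. */
double run_pingpong(void)
{
    pthread_t tid;
    struct timespec start, end;

    event1 = eventflag_create();
    event2 = eventflag_create();
    if (event1 == NULL || event2 == NULL ||
        pthread_create(&tid, NULL, responder, NULL) != 0) {
        eventflag_destroy(event1);
        eventflag_destroy(event2);
        return -1.0;
    }

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < ROUND_TRIPS; i++) {
        eventflag_set(event1);   /* ping */
        eventflag_wait(event2);  /* wait for the reply */
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    pthread_join(tid, NULL);
    eventflag_destroy(event1);
    eventflag_destroy(event2);

    double elapsed_us = (double)(end.tv_sec - start.tv_sec) * 1e6
                      + (double)(end.tv_nsec - start.tv_nsec) / 1e3;
    return elapsed_us / ROUND_TRIPS;
}
```

Calling run_pingpong() from main and printing the result gives the average round-trip time in microseconds. Building with something like gcc -O2 -pthread should work on a typical Linux toolchain.
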
Questions:
- Why don't I see a performance improvement here?
- What can be done to improve performance (for example, are there faster ways to implement the CE behavior)?
- I'm not used to coding with pthreads; are there any bugs in my implementation that might be causing the performance loss?
- Are there any alternative libraries for this?
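
As a point of comparison for the second question, one candidate that might be worth measuring is a POSIX semaphore in place of the mutex plus condition variable. The sketch below is an assumption, not a drop-in equivalent: the sem_event_* names are hypothetical, and a semaphore counts, so two sets before a wait let two waits through, unlike a Win32 auto-reset event. In the ping-pong test at most one signal is ever outstanding, so the difference does not show up there. Whether this is actually faster on the i.MX27 would have to be measured.

```c
#include <errno.h>
#include <semaphore.h>
#include <stdlib.h>

/* Hypothetical event type backed by a single POSIX semaphore. */
struct sem_event {
    sem_t sem;
};

struct sem_event *sem_event_create(void)
{
    struct sem_event *ev = malloc(sizeof(*ev));
    if (ev == NULL)
        return NULL;
    /* pshared = 0: shared between threads of this process only;
     * initial value 0: the event starts non-signaled. */
    if (sem_init(&ev->sem, 0, 0) != 0) {
        free(ev);
        return NULL;
    }
    return ev;
}

void sem_event_set(struct sem_event *ev)
{
    sem_post(&ev->sem);  /* increments the count and wakes one waiter */
}

void sem_event_wait(struct sem_event *ev)
{
    /* sem_wait can be interrupted by a signal handler; retry in that case. */
    while (sem_wait(&ev->sem) == -1 && errno == EINTR)
        continue;
}

void sem_event_destroy(struct sem_event *ev)
{
    if (ev == NULL)
        return;
    sem_destroy(&ev->sem);
    free(ev);
}
```

In the test loops, sem_event_set and sem_event_wait would simply replace eventflag_set and eventflag_wait.
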