Confusing result when counting page faults in Linux

I wrote some code to count the number of page faults on a Linux system. More precisely, it counts the number of times the kernel executes the function __do_page_fault.
To do this, I added two global variables called pfcount_at_beg and pfcount_at_end, each of which is incremented once per execution of __do_page_fault, at different places in the function.

To illustrate, the modified function looks like this:

    unsigned long pfcount_at_beg = 0;
    unsigned long pfcount_at_end = 0;

    static void __kprobes
    __do_page_fault(...)
    {
        struct vm_area_struct *vma;
        ... // VARIABLES DEFINITION
        unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;

        pfcount_at_beg++;   // I added THIS
        ...
        ... // ORIGINAL CODE OF THE FUNCTION
        ...
        pfcount_at_end++;   // I added THIS
    }

I expected pfcount_at_end to be less than pfcount_at_beg.

My reasoning: every time the kernel executes pfcount_at_end++, it must already have executed pfcount_at_beg++, since every call starts at the top of the function.
On the other hand, there are many conditional return statements between these two lines of code, so some calls never reach pfcount_at_end++.

However, the result is the opposite: the value of pfcount_at_end is greater than the value of pfcount_at_beg.
I use printk to print these kernel variables from a custom syscall, and I wrote a user-level program to invoke that system call.

Here is my simple syscall and user program:

    // syscall
    asmlinkage int sys_mysyscall(void)
    {
        printk(KERN_INFO "total pf_at_beg%lu\ntotal pf_at_end%lu\n",
               pfcount_at_beg, pfcount_at_end);
        return 0;
    }

    // user-level program
    #include <linux/unistd.h>
    #include <sys/syscall.h>
    #include <unistd.h>   // declares syscall()

    #define __NR_mysyscall 223

    int main()
    {
        syscall(__NR_mysyscall);
        return 0;
    }

Does anyone know what exactly is happening here?

Update: I have now changed the code to make pfcount_at_beg and pfcount_at_end static. However, the result has not changed, i.e. the value of pfcount_at_end is still greater than the value of pfcount_at_beg. Perhaps this is because the increment operation is not atomic. Would it be better if I used a read-write lock?

1 answer

The ++ operator is not guaranteed to be atomic, so your counters may suffer from concurrent access and end up with incorrect values. You must either protect each increment as a critical section or use the atomic_t type defined in <asm/atomic.h>, together with the associated functions atomic_set() and atomic_add() (and many more).
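The lost-update effect can be reproduced outside the kernel. The following is only a user-space sketch, assuming C11 <stdatomic.h> and POSIX threads as stand-ins for kernel code; the names racy_count, safe_count and run_counters are made up for illustration. The "racy" counter spells out what a plain ++ amounts to (a separate load and store), while the "safe" counter uses a single atomic read-modify-write, which is roughly what atomic_inc() on an atomic_t would give you in the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 64

/* Hypothetical stand-ins for the kernel counters (not the real variables). */
static atomic_ulong racy_count;  /* updated with a separate load and store */
static atomic_ulong safe_count;  /* updated with one atomic read-modify-write */

struct counts {
    unsigned long racy;
    unsigned long safe;
};

static void *worker(void *p)
{
    unsigned long iters = *(const unsigned long *)p;

    for (unsigned long i = 0; i < iters; i++) {
        /* What a plain ++ boils down to: load, add, store.
         * Another thread can store in between, and its update is lost. */
        unsigned long v = atomic_load_explicit(&racy_count, memory_order_relaxed);
        atomic_store_explicit(&racy_count, v + 1, memory_order_relaxed);

        /* Indivisible increment: no update can be lost. */
        atomic_fetch_add(&safe_count, 1);
    }
    return NULL;
}

/* Run nthreads workers, each incrementing both counters iters times. */
static struct counts run_counters(int nthreads, unsigned long iters)
{
    pthread_t tids[MAX_THREADS];
    struct counts result;
    int started = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    atomic_store(&racy_count, 0);
    atomic_store(&safe_count, 0);

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, worker, &iters) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    result.racy = atomic_load(&racy_count);
    result.safe = atomic_load(&safe_count);
    return result;
}
```

Calling run_counters(4, 100000) always yields safe == 400000, while racy can come out lower on a multi-core machine. If pfcount_at_beg loses more updates than pfcount_at_end, that alone could make the end counter appear larger, although this sketch does not prove that it is what happens in your kernel.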

Not directly related to your problem, but adding a dedicated syscall is overkill (though maybe this is an exercise). An easier solution would be to expose the counters through a /proc entry (also an interesting exercise).
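With a /proc entry, the user-level side becomes ordinary file reading. The sketch below assumes a hypothetical entry (say /proc/pfcount, which does not exist in a stock kernel) that prints one "name value" pair per line; both the path and the line format are assumptions, and read_counter is a made-up helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan "name value" lines from fp and store the value for key in *out.
 * Returns 0 on success, -1 if the key is not found or a line is malformed.
 * The line format is an assumption about a hypothetical /proc entry.
 */
static int read_counter(FILE *fp, const char *key, unsigned long *out)
{
    char name[64];
    unsigned long value;

    assert(fp != NULL && key != NULL && out != NULL);

    while (fscanf(fp, "%63s %lu", name, &value) == 2) {
        if (strcmp(name, key) == 0) {
            *out = value;
            return 0;
        }
    }
    return -1;
}
```

A caller would open the entry with fopen("/proc/pfcount", "r"), call read_counter(fp, "pfcount_at_beg", &beg) and then fclose(fp), with no custom syscall number to maintain.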


Source: https://habr.com/ru/post/1209843/

