How can a Linux character device driver detect when a program using it exits abnormally?

I have a Linux character device driver that creates /dev/mything, and a C++/Qt program that opens the device and uses it. If this program exits normally with exit(), the device is closed and the driver resets itself correctly. But if the program exits abnormally, through a segfault, SIGINT, or something else, the device is not closed properly.

My current workaround is to reload the driver if it is stuck in an "open" state.

This check in the driver's open function tries to prevent several programs from using the device at the same time:

    int mything_open( struct inode* inode, struct file* filp ) {
        ...
        if ( port->rings[bufcount].virt_addr ) return -EBUSY;
        ...
    }

The marker is then cleared in release:

    int mything_release( struct inode* inode, struct file* filp ) {
        ...
        port->rings[bufcount].virt_addr = NULL;
        ...
    }

I think exit() causes mything_release to be called, but SIGINT does not. How can I make the driver more robust in this situation?

EDIT:

Here are the file operations I have implemented. Maybe I missed something?

    static struct file_operations fatpipe_fops = {
        .owner = THIS_MODULE,
        .open = mything_open,
        .release = mything_release,
        .read = mything_read,
        .write = mything_write,
        .ioctl = mything_ioctl
    };
2 answers

The problem came down to this line in mything_release, which waits for writes to memory to complete:

    if (wait_event_interruptible_timeout(port->inq, false, 10))
        return -ERESTARTSYS;

On a normal program exit, it waits for 10 jiffies and carries on. But on an abnormal exit from SIGINT or similar, I think the interruptible wait was interrupted and returned -ERESTARTSYS, so my if statement returned from release with the same error before the rest of the cleanup ran.

What worked for me was to drop the if and simply wait:

 wait_event_interruptible_timeout(port->inq, false, 10); 
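The difference between the two versions can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct port_model`, `fake_wait`, `release_buggy`, and `release_fixed` are hypothetical stand-ins, and it assumes the busy marker is cleared after the wait inside mything_release (the elided parts of the original do not confirm the order). The one kernel fact it relies on is that `wait_event_interruptible_timeout` with a condition that is never true returns 0 when the timeout expires and `-ERESTARTSYS` when a signal interrupts the sleep.

```c
#include <stddef.h>

/* Kernel-internal errno value; it is not exported to userspace headers. */
#define ERESTARTSYS 512

/* Hypothetical stand-in for the driver's per-port state. */
struct port_model {
    void *virt_addr;     /* non-NULL means "busy", as tested in mything_open */
    int signal_pending;  /* models a process that is dying from a signal */
};

/* Models wait_event_interruptible_timeout(port->inq, false, 10):
 * the condition never becomes true, so the result is 0 on timeout
 * or -ERESTARTSYS if a signal interrupts the sleep. */
long fake_wait(const struct port_model *p)
{
    return p->signal_pending ? -ERESTARTSYS : 0;
}

/* Original logic: returns early, so the busy marker is never cleared. */
int release_buggy(struct port_model *p)
{
    if (fake_wait(p))
        return -ERESTARTSYS;
    p->virt_addr = NULL;
    return 0;
}

/* Fixed logic: ignore the wait result and always finish the cleanup. */
int release_fixed(struct port_model *p)
{
    (void)fake_wait(p);
    p->virt_addr = NULL;
    return 0;
}
```

With `signal_pending` set, `release_buggy` leaves `virt_addr` non-NULL, which is the state in which the next open would return -EBUSY; `release_fixed` clears it regardless of how the wait ended.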

This patch from a few years ago led me to believe that returning ERESTARTSYS from a close/release function is not a good idea: http://us.generation-nt.com/answer/patch-fix-wrong-error-code-interrupted-close-syscalls-help-181191441.html


There is no need for such a test. The problem is not the abnormal termination of the program (which, from your driver's point of view, is exactly the same as a normal close of the device); it is a problem in the state of your device. In other words, if you inserted close(dev_fd) or even exit(0) at the exact point where your program crashes, you would see the same problem.
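The claim that the kernel closes descriptors on abnormal termination can be checked from userspace without the driver. The sketch below uses a pipe instead of /dev/mything, and the function name `killed_child_closes_fd` is made up for illustration. A child holds the write end, never calls close() or exit(), and kills itself with a signal; the parent then sees end-of-file, which only happens once the kernel has released the last write end.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent sees EOF on the pipe after the child was
 * terminated by `sig`, 0 if not, and -1 if pipe() or fork() failed. */
int killed_child_closes_fd(int sig)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: keep the write end open and die without closing it. */
        close(fds[0]);
        signal(sig, SIG_DFL);  /* has no effect for SIGKILL, which is fine */
        raise(sig);
        _exit(1);              /* only reached if the signal was not fatal */
    }

    /* Parent: drop its own write end so the child's copy is the last one. */
    close(fds[1]);

    char c;
    ssize_t n = read(fds[0], &c, 1);  /* blocks until all write ends are closed */

    int status = 0;
    waitpid(pid, &status, 0);
    close(fds[0]);

    return n == 0 && WIFSIGNALED(status) && WTERMSIG(status) == sig;
}
```

Calling `killed_child_closes_fd(SIGKILL)` or `killed_child_closes_fd(SIGINT)` from a small main should return 1 on Linux. The same process-exit path is what invokes a character driver's release function once the last reference to the open file is dropped.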

You should find out which part of your driver's behavior leaves it in the busy state, and fix that.


Source: https://habr.com/ru/post/918952/

