I have an application that works well for processing files that are dropped into a directory on my server. The process is:
1) check for files in the directory
2) queue a user work item to handle each file in the background
3) wait until all workers have completed
4) go to 1
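For reference, the loop looks roughly like the sketch below. This is a simplified illustration, not my exact code: `ProcessFile`, the directory path, the use of `CountdownEvent` for the wait in step 3, and the one-second pause between passes are all placeholders or assumptions.

```csharp
using System;
using System.IO;
using System.Threading;

class DirectoryPoller
{
    // Hypothetical placeholder for the real per-file work.
    static void ProcessFile(string path)
    {
        Console.WriteLine("Processing " + path);
    }

    static void Main()
    {
        string watchDir = @"C:\incoming"; // assumed path, for illustration only

        while (true)
        {
            // 1) check for files in the directory
            string[] files = Directory.GetFiles(watchDir);

            if (files.Length > 0)
            {
                // 2) queue one work item per file
                using (var pending = new CountdownEvent(files.Length))
                {
                    foreach (string file in files)
                    {
                        ThreadPool.QueueUserWorkItem(state =>
                        {
                            try { ProcessFile((string)state); }
                            finally { pending.Signal(); } // count down even on failure
                        }, file);
                    }

                    // 3) wait until every worker has finished;
                    //    a single slow file keeps the loop stuck here
                    pending.Wait();
                }
            }

            // 4) go back to step 1 (the pause is an assumption, to avoid a busy loop)
            Thread.Sleep(1000);
        }
    }
}
```

Because step 3 waits for the whole batch, no file can ever be queued twice, but the next directory scan cannot start until the slowest worker is done.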
This works well, and I never have to worry about the same file being processed twice or multiple threads being started for the same file. However, if one file takes too long to process, step 3 blocks on that one file and holds up all other processing.
So my question is: what is the right pattern for starting exactly one thread per file that needs processing, without blocking when one file takes too long? I considered FileSystemWatcher, but the files may not be readable yet when they appear, so instead I keep scanning all the files and start a worker for each one (which exits immediately if the file is locked).
Should I drop step 3 and keep a list of the files I have already processed? That seems messy, and the list would grow very large over time, so I suspect there is a more elegant solution.