A faster way to move a file than File.Move

I have a console application that will take about 625 days to finish at its current speed, unless there is a way to make it faster.

First, I am working in a directory that holds about 4,000,000 files, if not more. There is also a database with a row for each file, and then some.

Working with SQL is relatively fast. The bottleneck is File.Move(): each move takes 18 seconds.

Is there a faster way than File.Move()?

This is the bottleneck:

 File.Move(Path.Combine(location, fileName), Path.Combine(rootDir, fileYear, fileMonth, fileName)); 

All other code is pretty fast. All I have to do is move one file to a new location and then update the database location field.

I can show other code if necessary, but the line above is the only current bottleneck.

3 answers

It turned out that switching from File.Move to creating a FileInfo and calling .MoveTo significantly increased the speed.

The job will now finish in about 35 days instead of 625.

 FileInfo fileinfo = new FileInfo(Path.Combine(location, fileName));
 fileinfo.MoveTo(Path.Combine(rootDir, fileYear, fileMonth, fileName));

18 seconds per move is really unusual. NTFS does not perform well when you have many files in the same directory. When you request a file, it has to search the directory's data structure. With 1,000 files, this does not take much time. With 10,000 files, you notice it. With 4 million files, yes, it takes some time.

Perhaps you can make this even faster if you first load all the directory entries into memory. Then, instead of calling the FileInfo constructor for each file, you simply look it up in a dictionary.

Something like:

 var dirInfo = new DirectoryInfo(path);
 // get list of all files
 var files = dirInfo.GetFileSystemInfos();
 var cache = new Dictionary<string, FileSystemInfo>();
 foreach (var f in files)
 {
     cache.Add(f.FullName, f);
 }

Now, when you get a name from the database, you can just look it up in the dictionary. This can be much faster than fetching it from disk every time.
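The lookup step might look like the minimal sketch below. It assumes the dictionary is keyed by full path, as in the snippet above. The temporary directory, the 2013/01 subfolder, and the name a.txt are hypothetical stand-ins for the real source directory, the year/month destination, and a file name read from the database. The case-insensitive comparer is also an assumption, chosen to match a case-insensitive file system such as NTFS.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-ins for the real source directory and year/month target.
string location = Directory.CreateDirectory(
    Path.Combine(Path.GetTempPath(), Path.GetRandomFileName())).FullName;
string destination = Path.Combine(location, "2013", "01");
Directory.CreateDirectory(destination);
File.WriteAllText(Path.Combine(location, "a.txt"), "demo");

// Build the cache once: a single directory scan instead of one lookup per file.
var cache = new Dictionary<string, FileInfo>(StringComparer.OrdinalIgnoreCase);
foreach (FileInfo f in new DirectoryInfo(location).EnumerateFiles())
{
    cache[f.FullName] = f;
}

// "a.txt" stands in for a file name read from the database.
string fileName = "a.txt";
bool moved = false;
if (cache.TryGetValue(Path.Combine(location, fileName), out var info))
{
    // Reuse the cached FileInfo instead of constructing a new one.
    info.MoveTo(Path.Combine(destination, fileName));
    moved = true;
}

Console.WriteLine(moved ? "moved" : "not found in cache");
```

Using TryGetValue rather than the indexer means a database row whose file is missing on disk is reported instead of throwing a KeyNotFoundException.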


You can also move files in parallel. Directory.EnumerateFiles gives you a lazily loaded list of files (of course, I have not tested this with 4,000,000 files):

 var numberOfConcurrentMoves = 2;
 var moves = new List<Task>();
 var sourceDirectory = "source-directory";
 var destinationDirectory = "destination-directory";
 foreach (var filePath in Directory.EnumerateFiles(sourceDirectory))
 {
     var move = new Task(() =>
     {
         File.Move(filePath, Path.Combine(destinationDirectory, Path.GetFileName(filePath)));
         // UPDATE DB
     }, TaskCreationOptions.PreferFairness);
     move.Start();
     moves.Add(move);
     // Wait for the current batch before starting more moves.
     if (moves.Count >= numberOfConcurrentMoves)
     {
         Task.WaitAll(moves.ToArray());
         moves.Clear();
     }
 }
 // Wait for any moves left in the last partial batch.
 Task.WaitAll(moves.ToArray());
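A variation on the same idea, not the answerer's code: Parallel.ForEach with ParallelOptions.MaxDegreeOfParallelism caps the number of concurrent moves without manual batching, so one slow move does not hold up a whole batch. The temporary directories and the five demo files are hypothetical, and whether parallel moves help at all depends on the disk. This sketch is just as untested at 4,000,000 files.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo directories; replace with the real source and destination.
string sourceDirectory = Directory.CreateDirectory(
    Path.Combine(Path.GetTempPath(), Path.GetRandomFileName())).FullName;
string destinationDirectory = Directory.CreateDirectory(
    Path.Combine(Path.GetTempPath(), Path.GetRandomFileName())).FullName;
for (int i = 0; i < 5; i++)
{
    File.WriteAllText(Path.Combine(sourceDirectory, $"file{i}.txt"), "demo");
}

int movedCount = 0;
var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };

// EnumerateFiles stays lazy; Parallel.ForEach pulls paths from it as workers free up.
Parallel.ForEach(Directory.EnumerateFiles(sourceDirectory), options, filePath =>
{
    string target = Path.Combine(destinationDirectory, Path.GetFileName(filePath));
    File.Move(filePath, target);
    // The database update for this file would go here.
    Interlocked.Increment(ref movedCount);
});

Console.WriteLine($"Moved {movedCount} files.");
```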

Source: https://habr.com/ru/post/954452/

