Reading parts of large files from disk

I work with large files in C# (up to perhaps 20-40% of available memory), and I only need small parts of a file loaded into memory at any one time (for example, 1-2% of the file). I thought a FileStream would be the best option, but I'm not sure. I need to specify a starting point (in bytes) and a length (in bytes) and copy that region into a byte[]. Access to the file may need to be shared between threads and will be at random locations in the file (non-linear access). I also need it to be fast.

The project already uses unsafe methods, so feel free to suggest things from the more dangerous side of C#.

+4
3 answers

A FileStream will let you seek to the part of the file you want without problems. This is the recommended way to do it in C#, and it is fast.

Sharing between threads: you need a lock so that other threads do not change the position of the FileStream while you are reading from it. The easiest way to do this:

    using System.IO;

    // This really needs to be a member-level variable.
    private static readonly object fsLock = new object();

    // Instantiate this in a static constructor or an initialize() method.
    private static FileStream fs = new FileStream("myFile.txt", FileMode.Open, FileAccess.Read);

    // The offset is a long because large files can exceed int.MaxValue bytes.
    public byte[] ReadFile(long fileOffset, int length)
    {
        byte[] buffer = new byte[length];
        int arrayOffset = 0;

        lock (fsLock)
        {
            fs.Seek(fileOffset, SeekOrigin.Begin);

            // Read may return fewer bytes than requested, so loop, reading
            // blocks until the buffer is full or the end of the file is hit.
            while (arrayOffset < length)
            {
                int numBytesRead = fs.Read(buffer, arrayOffset, length - arrayOffset);
                if (numBytesRead == 0) break;
                arrayOffset += numBytesRead;
            }
        }

        // Do what you want with the byte array and return it.
        return buffer;
    }

Add try..catch and other code as necessary. Everywhere you access this FileStream, take a lock on the member-level fsLock variable; this keeps other methods from reading or moving the file pointer while you are reading.

Most likely you will find that you are limited by disk access speed, not by the code.

You will have to think through all the issues around multi-threaded file access: who initializes/opens the file, who closes it, and so on. There is a lot of ground to cover.
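One way to settle the open/close questions is to give the stream a single owner. The following is only a sketch under that assumption: `SharedFileReader` and its members are hypothetical names, not part of the answer above. The class opens the file in its constructor, serializes reads with a lock, and closes the file in `Dispose`.

```csharp
using System;
using System.IO;

// Hypothetical helper: owns one FileStream, serializes access with a lock,
// and closes the stream when disposed.
public sealed class SharedFileReader : IDisposable
{
    private readonly object _lock = new object();
    private readonly FileStream _stream;

    public SharedFileReader(string path)
    {
        // FileShare.Read lets other readers open the same file concurrently.
        _stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
    }

    // Returns up to `length` bytes starting at `offset`. The array is shorter
    // than `length` only if the end of the file is reached first.
    public byte[] Read(long offset, int length)
    {
        byte[] buffer = new byte[length];
        int total = 0;

        lock (_lock)
        {
            _stream.Seek(offset, SeekOrigin.Begin);
            while (total < length)
            {
                int n = _stream.Read(buffer, total, length - total);
                if (n == 0) break; // end of file
                total += n;
            }
        }

        if (total < length) Array.Resize(ref buffer, total);
        return buffer;
    }

    public void Dispose()
    {
        // Take the lock so the stream is not closed in the middle of a read.
        lock (_lock)
        {
            _stream.Dispose();
        }
    }
}
```

Whichever component creates the reader (for example in a `using` block around the worker threads' lifetime) is then also the one responsible for disposing it, and the threads only ever call `Read`.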

+5

I don't know anything about the structure of these files, but reading part of a file from FileStream or the like sounds like the best and fastest way to do this.

You will not need a separate copy into a byte[], since FileStream can read directly into a byte array.

It sounds like you may know more about the structure of the file, which could suggest additional techniques. But if you only need to read part of the file, this is probably the way to do it.

+1

If you are using .NET 4, take a look at memory-mapped files in the System.IO.MemoryMappedFiles namespace.

They are ideal for reading small fragments from large files. See the MSDN documentation for that namespace.

You can also do this in earlier versions of .NET, but then you need to wrap the Win32 API (or use http://winterdom.com/dev/net).
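For the .NET 4 route, a minimal sketch of reading one byte range through a memory-mapped view might look like this. `MappedFileReader` and `ReadRange` are hypothetical names chosen for illustration, and the code assumes that `offset + length` does not exceed the file size.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// Hypothetical helper, not from the answer above.
public static class MappedFileReader
{
    // Copies `length` bytes starting at byte `offset` of the file into a new array.
    // Assumption: offset + length is within the file.
    public static byte[] ReadRange(string path, long offset, int length)
    {
        // Capacity 0 means "use the size of the file on disk";
        // the mapping and the view are both opened read-only.
        using (var mmf = MemoryMappedFile.CreateFromFile(
                   path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (var view = mmf.CreateViewAccessor(offset, length, MemoryMappedFileAccess.Read))
        {
            byte[] buffer = new byte[length];
            // Position 0 is relative to the start of the view, not the file.
            view.ReadArray(0, buffer, 0, length);
            return buffer;
        }
    }
}
```

Creating the mapping on every call keeps the sketch short; with many reads it is probably better to create the MemoryMappedFile once and open a view per read. View accessors take an explicit position instead of sharing a file pointer, so the seek-then-read race that the FileStream lock guards against does not arise in the same way.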

0

Source: https://habr.com/ru/post/1342479/

