C++: reading text line by line, speed/efficiency needed

I have a series of large text files (tens to hundreds of thousands of lines) that I want to parse line by line. The idea is to check whether each line contains a specific word, character, or phrase and, if it does, write that line to a separate file.

The code I've used so far:

    std::ifstream infile1("c:/test/test.txt");
    std::string line;
    while (std::getline(infile1, line)) {
        if (line.empty()) continue;
        if (line.find("mystring") != std::string::npos) {
            outfile1 << line << '\n';  // outfile1 is an already-open std::ofstream
        }
    }

The ultimate goal is to write these lines to a database. My plan is to write them to a file first and then import that file.

The problem I have run into is the time the task takes. I am trying to keep it as short as possible, so any suggestions for speeding up the read/write code above would be most welcome. Apologies if I am missing something obvious; I have only just started moving to C++.

Thanks

EDIT

I should mention that I am using VS2015.

EDIT 2

So, it was my own mistake: once I switched to a Release build and changed the target architecture, I saw a noticeable speedup. Thanks to everyone who pointed me in that direction. I am also looking at the mmap material, which is useful too. Thanks, guys!
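Since the build configuration turned out to be the deciding factor, it can help to time the loop in each configuration before tuning anything else. Below is a minimal sketch of such a measurement; the helper name `elapsedMs` is an illustrative assumption and not something from the thread.

```cpp
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in milliseconds, measured with a monotonic clock.
template <typename F>
double elapsedMs(F&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrapping the read/write loop in a lambda and passing it to `elapsedMs` gives one number per run; comparing Debug against Release on the same input file shows how much of the time is due to the build settings rather than to the I/O code.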

2 answers

When you use ifstream/ofstream to read from or write to really large files, it can help to increase the default buffer size, which is implementation-defined and often small (512 bytes in some implementations).

The best buffer size depends on your needs, but as a starting point you can use the block (cluster) size of the partition that holds the file(s) you read or write. Many tools, or a few lines of code, can report this value.

Windows example:

 fsutil fsinfo ntfsinfo c: 

Now create a new buffer for the ifstream as follows:

    size_t newBufferSize = 4 * 1024; // 4K
    char* newBuffer = new char[newBufferSize];
    ifstream infile1;
    infile1.rdbuf()->pubsetbuf(newBuffer, newBufferSize); // set the buffer before open
    infile1.open("c:/test/test.txt");
    while (getline(infile1, line)) { /* ... */ }
    infile1.close();     // stop using the buffer before freeing it
    delete[] newBuffer;  // array form: the buffer came from new[]

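Do the same for the output stream, and do not forget to set the new buffer before opening the file, otherwise it may have no effect. Keep in mind that the behavior of pubsetbuf on file streams is implementation-defined, so check how your standard library handles it.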
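Putting the steps above together for both streams might look like the sketch below. The function name `filterLines`, its parameters, and the 64 KiB default are illustrative assumptions, not part of the answer. The buffers are held in `std::vector` so no manual `delete[]` is needed. If a library ignores the `pubsetbuf` request, the code still works, only with the default buffer.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: copies every line of inPath that contains `needle`
// to outPath and returns the number of lines copied.
// Returns -1 if either file cannot be opened.
long filterLines(const std::string& inPath, const std::string& outPath,
                 const std::string& needle, std::size_t bufSize = 64 * 1024)
{
    // The buffers are declared before the streams so that they are
    // destroyed after them; a stream must never outlive its buffer.
    std::vector<char> inBuf(bufSize);
    std::vector<char> outBuf(bufSize);

    std::ifstream in;
    std::ofstream out;

    // Request the larger buffers before opening. Whether the request is
    // honored is implementation-defined.
    in.rdbuf()->pubsetbuf(inBuf.data(), static_cast<std::streamsize>(inBuf.size()));
    out.rdbuf()->pubsetbuf(outBuf.data(), static_cast<std::streamsize>(outBuf.size()));

    in.open(inPath);
    out.open(outPath);
    if (!in || !out) return -1;

    long matches = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (line.find(needle) != std::string::npos) {
            out << line << '\n';
            ++matches;
        }
    }
    return matches;
}
```

A call such as `filterLines("c:/test/test.txt", "c:/test/out.txt", "mystring")` (the output path is made up for illustration) would then do the whole job; trying a few values for `bufSize` and timing each run is the simplest way to pick one.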

You can play with the values to find the best size for you.

You will notice the difference.


C-style I/O functions are often faster than fstream, although how much depends on the implementation and build settings. You can use fgets/fputs to read and write each line of text.
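A minimal sketch of the same filter written with `fgets`/`fputs` follows. The function name `filterLinesC` and the 4096-byte line buffer are assumptions made for illustration; a line longer than the buffer would be read in several pieces and each piece tested separately.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper: the same filter written with C stdio.
// Returns the number of lines copied, or -1 if a file cannot be opened.
// Assumption: no line is longer than 4095 characters.
long filterLinesC(const char* inPath, const char* outPath, const char* needle)
{
    std::FILE* in = std::fopen(inPath, "r");
    if (!in) return -1;
    std::FILE* out = std::fopen(outPath, "w");
    if (!out) { std::fclose(in); return -1; }

    char line[4096];
    long matches = 0;
    while (std::fgets(line, static_cast<int>(sizeof line), in)) {
        // fgets keeps the trailing '\n', so fputs writes the line as-is.
        if (std::strstr(line, needle)) {
            std::fputs(line, out);
            ++matches;
        }
    }
    std::fclose(out);
    std::fclose(in);
    return matches;
}
```

Visual Studio may flag `fopen` as deprecated; `fopen_s` or defining `_CRT_SECURE_NO_WARNINGS` are the usual ways around that. Whether this version actually beats the fstream one is worth measuring on your own files rather than assuming.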


Source: https://habr.com/ru/post/1235788/

