I am trying to port a C# program to C++. The C# program reads a text file of 1 to 5 GB line by line and performs some analysis on each line. The C# code is as follows.
    using (var f = File.OpenRead(fname))
    using (var reader = new StreamReader(f))
        while (!reader.EndOfStream) {
            var line = reader.ReadLine();
            // analysis of line
        }
For a 1.6 GB file with 7 million lines, this code takes about 18 seconds.
The first C++ code I wrote for the port is shown below.
    ifstream f(fname);
    string line;
    while (getline(f, line)) {
        // analysis of line
    }
This C++ code takes about 420 seconds. The second C++ version I wrote is as follows.
    ifstream f(fname);
    char line[2000];
    while (f.getline(line, 2000)) {
        // analysis of line
    }
The C++ code above takes about 85 seconds.
The last version I tried is C code, as shown below.
    FILE *file = fopen(fname, "r");
    char line[2000];
    while (fgets(line, 2000, file) != NULL) {
        // analysis of line
    }
    fclose(file);
The C code above takes about 33 seconds.
Both of the last two versions read each line into a char[] rather than a string, and converting the char[] to a string takes about another 30 seconds.
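For reference, the char[] to string conversion described above could be sketched roughly as follows. This is only an illustration under assumptions: the helper name read_lines_reusing_string is hypothetical, the character count stands in for the real per-line analysis, and reusing a single std::string via assign() is one possible way to avoid a fresh allocation per line, not a measured fix for the 30 seconds.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: reads `path` line by line with fgets and copies each
// line into one reused std::string. Returns the total number of characters
// read (excluding the trailing '\n') as a stand-in for the real analysis.
// Lines longer than 1999 characters are split across several fgets calls,
// the same limit as the char[2000] buffer in the question.
std::size_t read_lines_reusing_string(const char* path, std::size_t& line_count) {
    line_count = 0;
    std::size_t total_chars = 0;

    std::FILE* file = std::fopen(path, "r");
    if (file == nullptr) {
        return 0;  // could not open the file
    }

    char buf[2000];
    std::string line;
    line.reserve(sizeof buf);  // allocate once, reuse for every line

    while (std::fgets(buf, sizeof buf, file) != nullptr) {
        std::size_t len = std::strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            --len;  // drop the line terminator kept by fgets
        }
        line.assign(buf, len);  // copies into the existing capacity
        total_chars += line.size();
        ++line_count;
    }

    std::fclose(file);
    return total_chars;
}
```

Calling it with a file name and a std::size_t variable for the line count gives both totals in one pass; the actual analysis would replace the total_chars line.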
Is there a way to improve the performance of C/C++ code for reading a text file line by line so that it matches the C# performance? (Added: I am using 64-bit Windows 7 with VC++ 10.0, x64.)