I am currently working on a project where I have a large text file (15+ GB) and I need to run a function on every line of it. To speed this up, I create 4 threads and have them all read from the same file stream at the same time. This is similar to what I have:
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <thread>

    using namespace std;

    // Each thread reads one line from the shared stream and prints it.
    // Nothing synchronizes access to the stream here.
    void simpleFunction(ifstream* wordlist) {
        string word;
        getline(*wordlist, word);
        cout << word << endl;
    }

    int main() {
        const int max_concurrant_threads = 4;
        ifstream wordlist("filename.txt");
        thread all_threads[max_concurrant_threads];
        for (int i = 0; i < max_concurrant_threads; i++) {
            all_threads[i] = thread(simpleFunction, &wordlist);
        }
        for (int i = 0; i < max_concurrant_threads; ++i) {
            all_threads[i].join();
        }
        return 0;
    }
The getline function (as well as "*wordlist >> word") seems to advance the stream position and read the value in two separate steps, since I regularly get output like this:

    Item1
    Item2
    Item3
    Item2
So I was wondering: is there a way to read a line from a file atomically? Loading it into an array first will not work because the file is too large, and I would prefer not to load the file one chunk at a time.
Sadly, I could not find anything about the atomicity of fstream and getline. If there is an atomic version of readline, or even an easy way to use locks to achieve what I want, I'm all ears.
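To make the lock-based idea concrete, here is a minimal sketch of what I have in mind, assuming a std::mutex around each getline call is enough to make every read all-or-nothing. The names LockedLineReader, for_each_line_parallel and count_lines_parallel are hypothetical and only there for illustration; I have not verified this against the real 15 GB file.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: wraps a shared input stream so that each call to
// next_line() reads exactly one whole line while holding a mutex.
class LockedLineReader {
public:
    explicit LockedLineReader(std::istream& in) : in_(in) {}

    // Returns false once the stream is exhausted (or in a failed state).
    bool next_line(std::string& line) {
        std::lock_guard<std::mutex> guard(mutex_);
        return static_cast<bool>(std::getline(in_, line));
    }

private:
    std::istream& in_;
    std::mutex mutex_;
};

// Hypothetical driver: starts thread_count workers that pull lines from the
// same stream until it is empty and pass each line to handle_line.
// handle_line runs outside the lock, so it must be safe to call concurrently.
void for_each_line_parallel(std::istream& in, unsigned thread_count,
                            const std::function<void(const std::string&)>& handle_line) {
    assert(thread_count > 0);
    LockedLineReader reader(in);
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < thread_count; ++i) {
        workers.emplace_back([&reader, &handle_line] {
            std::string line;
            while (reader.next_line(line)) {
                handle_line(line);
            }
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
}

// Small in-memory demo so the sketch can be tried without a large file:
// counts how many lines the workers handled in total.
std::size_t count_lines_parallel(const std::string& text, unsigned thread_count) {
    std::istringstream in(text);
    std::atomic<std::size_t> handled{0};
    for_each_line_parallel(in, thread_count,
                           [&handled](const std::string&) { ++handled; });
    return handled.load();
}
```

With the real file, the idea would be to pass an ifstream opened on "filename.txt" to for_each_line_parallel with 4 threads. Only the read is serialized, so this should only help if the per-line function is the expensive part. Printing with cout from inside the callback would still need its own lock to keep output lines from interleaving.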
Thanks in advance!