EOF reached before the end of the file

I am writing a program for school in which several processes each read part of a file and work together to count the number of words in it. The problem: if there are more than two processes, every process reads EOF before it has finished reading its part of the file. Here is the relevant code:

 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <errno.h>

 int main(int argc, char *argv[])
 {
     FILE *input_textfile = NULL;
     char input_word[1024];
     int num_processes = 0;
     int proc_num = 0; //The index of this process (used after forking)
     long file_size = -1;

     input_textfile = fopen(argv[1], "r");
     num_processes = atoi(argv[2]);

     //...Normally error checking would go here

     if (num_processes > 1) {
         //...create space for pipes
         for (proc_num = 0; proc_num < num_processes - 1; proc_num++) {
             //...create pipes
             pid_t proc = fork();
             if (proc == -1) {
                 fprintf(stderr,"Could not fork process index %d", proc_num);
                 perror("");
                 return 1;
             } else if (proc == 0) {
                 break;
             }
             //...link up the pipes
         }
     }

     //This code taken from http://stackoverflow.com/questions/238603/how-can-i-get-a-files-size-in-c
     //Interestingly, it also fixes a bug we had where the child would start reading at an unpredictable place
     //No idea why, but apparently the offset wasn't guaranteed to start at 0 for some reason
     fseek(input_textfile, 0L, SEEK_END);
     file_size = ftell(input_textfile);
     fseek(input_textfile, proc_num * (1.0 * file_size / num_processes), 0);

     //read all words from the file and add them to the linked list
     if (file_size != 0) {
         //Explanation of this mess of a while loop:
         // if we're a child process (proc_num < num_processes - 1), then loop until we make it to where the next
         // process would start (the ftell part)
         // if we're the parent (proc_num == num_processes - 1), loop until we reach the end of the file
         while ((proc_num < num_processes - 1 && ftell(input_textfile) < (proc_num + 1) * (1.0 * file_size / num_processes))
                 || (proc_num == num_processes - 1 && ftell(input_textfile) < file_size)) {
             int res = fscanf(input_textfile, "%s", input_word);
             if (res == 1) {
                 //count the word
             } else if (res == EOF && errno != 0) {
                 perror("Error reading file: ");
                 exit(1);
             } else if (res == EOF && ftell(input_textfile) < file_size) {
                 printf("Process %d found unexpected EOF at %ld.\n", proc_num, ftell(input_textfile));
                 exit(1);
             } else if (res == EOF && feof(input_textfile)) {
                 continue;
             } else {
                 printf("Scanf returned unexpected value: %d\n", res);
                 exit(1);
             }
         }
     }

     //don't get here anyway, so no point in closing files and whatnot
     return 0;
 }

Output when run on a file with three processes:

 All files opened successfully
 Process 2 found unexpected EOF at 1323008.
 Process 1 found unexpected EOF at 823849.
 Process 0 found unexpected EOF at 331776.

Test file that causes the error: https://dl.dropboxusercontent.com/u/16835571/test34.txt

Compile with:

 gcc main.c -o wordc-mp 

and run as:

 wordc-mp test34.txt 3 

It is worth noting that only this particular file gives me problems, but the offsets reported in the error change from run to run, so the cause is not the contents of the file.

1 answer

You opened the file before forking. Each child process inherits a file descriptor that refers to the same open file description as the parent's, so the file offset is shared: when any one process reads or seeks, the offset moves for all of them.
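The shared offset can be observed directly. The following is only a minimal sketch, assuming a POSIX system; `offset_after_child_read` is a hypothetical helper name, not something from the question. It opens a file, forks, lets the child read a few bytes, and then reports the parent's offset even though the parent itself never read anything.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: open `path`, fork, let the child read up to `nbytes`
 * bytes, and return the PARENT's file offset afterwards (-1 on error). */
long offset_after_child_read(const char *path, size_t nbytes)
{
    char buf[256];
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: read through the inherited descriptor. */
        if (nbytes > sizeof buf)
            nbytes = sizeof buf;
        ssize_t n = read(fd, buf, nbytes);
        _exit(n < 0 ? 1 : 0);
    }

    /* Parent: wait for the child, then query the offset without reading. */
    int status;
    waitpid(pid, &status, 0);
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return (long)pos;
}
```

If the file holds at least 10 bytes, `offset_after_child_read(path, 10)` should return 10 rather than 0: the child's read moved the offset the parent sees. With buffered `FILE *` streams the effect is harder to predict, because each process refills its own buffer from wherever the shared offset happens to be at that moment.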

The fork(2) man page confirms this:

  • The child process is created with a single thread, the one that called fork(). The entire virtual address space of the parent is replicated in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be helpful for dealing with problems that this can cause.

  • The child inherits copies of the parent's set of open file descriptors. Each file descriptor in the child refers to the same open file description (see open(2)) as the corresponding file descriptor in the parent. This means that the two file descriptors share open file status flags, file offset, and signal-driven I/O attributes (see the description of F_SETOWN and F_SETSIG in fcntl(2)).
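One way to avoid the shared offset is to have each process open the file itself after the fork, so that every process gets its own open file description. The sketch below illustrates that idea and is not the asker's program: `count_words_in_slice` and `count_words_parallel` are hypothetical names, the single pipe used to collect results is an assumption, and the rule that a word belongs to the slice in which it starts is just one possible way to handle words that straddle a boundary.

```c
#include <ctype.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: count the words that START in bytes [start, end).
 * It opens its own stream, so the offset is private to the calling process. */
static long count_words_in_slice(const char *path, long start, long end)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    int in_word = 0;
    if (start > 0) {
        /* Peek at the byte before the slice: if it is not whitespace, the
         * slice begins mid-word and that word belongs to the previous slice. */
        if (fseek(fp, start - 1, SEEK_SET) != 0) {
            fclose(fp);
            return -1;
        }
        int prev = fgetc(fp);
        in_word = (prev != EOF && !isspace(prev));
    }

    long count = 0;
    int c;
    for (long pos = start; pos < end && (c = fgetc(fp)) != EOF; pos++) {
        if (isspace(c)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
    }
    fclose(fp);
    return count;
}

/* Hypothetical driver: split the file across nproc children and sum their
 * counts. Returns -1 on any error. */
long count_words_parallel(const char *path, int nproc)
{
    if (nproc < 1)
        return -1;
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    fseek(fp, 0L, SEEK_END);
    long size = ftell(fp);
    fclose(fp); /* closed before forking, so no descriptor is inherited */
    if (size < 0)
        return -1;

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    int started = 0;
    for (int i = 0; i < nproc; i++) {
        pid_t pid = fork();
        if (pid == -1)
            break;
        if (pid == 0) {
            /* Child: count its own slice and report the result. */
            close(fds[0]);
            long start = (long)((long long)size * i / nproc);
            long end = (long)((long long)size * (i + 1) / nproc);
            long n = count_words_in_slice(path, start, end);
            ssize_t w = write(fds[1], &n, sizeof n);
            _exit(w == (ssize_t)sizeof n && n >= 0 ? 0 : 1);
        }
        started++;
    }
    close(fds[1]);

    long total = 0;
    int failed = (started != nproc);
    for (int i = 0; i < started; i++) {
        long n;
        if (read(fds[0], &n, sizeof n) != (ssize_t)sizeof n || n < 0)
            failed = 1;
        else
            total += n;
    }
    close(fds[0]);
    while (wait(NULL) > 0)
        ; /* reap every child */
    return failed ? -1 : total;
}
```

Calling `count_words_parallel("test34.txt", 3)` would then fork three children that never touch each other's offsets. Keeping a single descriptor and reading with `pread`, which takes an explicit offset, is another option, but it does not combine with `fscanf`.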


Source: https://habr.com/ru/post/1245263/

