GNU getline: strange behavior at EOF

Test

To find out how getline() behaves when it runs into EOF, I wrote the following test:

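  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/types.h>

  int main (int argc, char *argv[])
  {
      size_t max = 100;
      char *buf = malloc(sizeof(char) * max);
      ssize_t len = getline(&buf, &max, stdin);  /* ssize_t: -1 on EOF or error */
      if (len != -1)
          printf("length %zd: %s", len, buf);
      free(buf);
      return 0;
  }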

Input 1:

a b c Ctrl-D Enter

Result:

  length 4: abc    // note that the '\n' is counted and printed as well

Input 2:

a b c Enter

Exactly the same result:

  length 4: abc 

It seems that getline() never sees the EOF.

Source

So I looked up the source code of getline(). The relevant fragment follows (I have left out comments and irrelevant code for brevity):

  while ((c = getc (stream)) != EOF)
    {
      /* Push the result in the line. */
      (*lineptr)[indx++] = c;

      /* Bail out. */
      if (c == delim)   // delim here is '\n'
        break;
    }

  /* Make room for the null character. */
  if (indx >= *n)
    {
      *lineptr = realloc (*lineptr, *n + line_size);
      if (*lineptr == NULL)
        return -1;
      *n += line_size;
    }

  /* Null terminate the buffer. */
  (*lineptr)[indx++] = 0;

  return (c == EOF && (indx - 1) == 0) ? -1 : indx - 1;

Question

So my question is:

  • Why is the length here 4? As far as I can see it should be 5 (as the wiki says, it won't be EOF if it is not at the beginning of the line).

A similar question: "the behavior of EOF when it is accompanied by other values". Note, however, that the getline() in that question is different from GNU getline.

I am using GCC: (Ubuntu 4.8.2-19ubuntu1) 4.8.2

1 answer

Ctrl-D makes your terminal flush its input buffer to the program if the buffer is not already empty. If it is empty, the end-of-file indicator for the input stream is set instead. A newline also flushes the buffer.

So you did not close the stream; you only flushed the input buffer, which is why getline does not see an end-of-file indicator.

In neither of these cases does getline receive a literal EOT character (ASCII 0x04, ^D). To send one, you can type Ctrl-V Ctrl-D.

Type

a b c Ctrl-D Ctrl-D

or

a b c Enter Ctrl-D

to actually set the end of file indicator.
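To see what getline reports once end of file really is reached, here is a minimal sketch. It assumes a POSIX system and uses a temporary file in place of the terminal; the helper name `two_getline_calls` is hypothetical and only serves this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: writes `input` to a temporary file, rewinds it, and
 * calls getline() twice, storing both return values.
 * Returns 0 on success, -1 if the temporary file could not be prepared. */
int two_getline_calls(const char *input, ssize_t *first, ssize_t *second)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    if (fputs(input, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    rewind(fp);

    char *buf = NULL;   /* getline() allocates and grows the buffer itself */
    size_t cap = 0;
    *first = getline(&buf, &cap, fp);
    *second = getline(&buf, &cap, fp);

    free(buf);
    fclose(fp);
    return 0;
}
```

With the input "abc\n" the first call returns 4 (three letters plus the newline) and the second returns -1. With "abc" and no newline, the first call returns 3 and the second -1. In both cases EOF is a condition reported by the stream, not a byte that is stored or counted, so a length of 5 never appears.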

From POSIX:

Special Characters

  • EOF

Special input character that is recognized if the ICANON flag is set. When received, all the bytes waiting to be read are immediately passed to the process without waiting for a <newline>, and the EOF is discarded. Thus, if there are no bytes waiting (that is, the EOF occurred at the beginning of a line), a byte count of zero shall be returned from the read(), representing an end-of-file indication. If ICANON is set, the EOF character shall be discarded when processed.
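The two outcomes of read() described in that passage can be imitated without a terminal. The following is only an analogy, assuming a POSIX system: a pipe stands in for the terminal, and closing the write end stands in for pressing Ctrl-D on an empty line. The helper name `simulate_flush_then_eof` is hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: mimics "a b c Ctrl-D Ctrl-D" with a pipe.
 * The first read() receives the three pending bytes; once nothing is pending
 * and the writer is gone, the second read() returns 0, the end-of-file
 * indication. Returns 0 on success, -1 on failure. */
int simulate_flush_then_eof(ssize_t *first, ssize_t *second)
{
    int fds[2];
    char buf[16];

    if (pipe(fds) == -1)
        return -1;

    /* "abc" is pending, with no newline after it. */
    if (write(fds[1], "abc", 3) != 3) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Like the first Ctrl-D: the pending bytes are handed to the reader. */
    *first = read(fds[0], buf, sizeof buf);

    /* Like the second Ctrl-D: nothing is pending, so read() returns 0. */
    close(fds[1]);
    *second = read(fds[0], buf, sizeof buf);

    close(fds[0]);
    return 0;
}
```

Here the first read() returns 3 and the second returns 0. On a real terminal nothing is closed: the zero-length read is simply how the driver reports Ctrl-D at the beginning of a line, and stdio turns it into the end-of-file indicator.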

FYI, the ICANON flag is specified here.
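If you want to check whether ICANON is actually set for your terminal, and which character is configured as EOF, a minimal sketch using the standard termios interface could look like this. The helper name `canonical_mode` is hypothetical.

```c
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd is a terminal in canonical mode,
 * 0 if it is a terminal with ICANON cleared, and -1 if tcgetattr() fails
 * (for example because fd is not a terminal).
 * If eof_char is not NULL, it receives c_cc[VEOF]; that value is only
 * meaningful as the EOF character while ICANON is set. */
int canonical_mode(int fd, cc_t *eof_char)
{
    struct termios tio;

    if (tcgetattr(fd, &tio) == -1)
        return -1;
    if (eof_char != NULL)
        *eof_char = tio.c_cc[VEOF];
    return (tio.c_lflag & ICANON) ? 1 : 0;
}
```

Calling `canonical_mode(STDIN_FILENO, &c)` from an interactive shell will typically return 1 with `c` equal to 4 (Ctrl-D), although both depend on the terminal settings. When stdin is a pipe or a file the call returns -1, and none of the Ctrl-D handling described above applies.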


Source: https://habr.com/ru/post/1200820/

