Why isn't the read size always a multiple of 4096 when reading a file line by line in Go?

Code example:

// test.go
package main

import (
    "bufio"
    "os"
)

func main() {
    if len(os.Args) != 2 {
        println("Usage:", os.Args[0], "<file>")
        os.Exit(1)
    }
    fileName := os.Args[1]
    fp, err := os.Open(fileName)
    if err != nil {
        println(err.Error())
        os.Exit(2)
    }
    defer fp.Close()
    r := bufio.NewScanner(fp)
    var lines []string
    for r.Scan() {
        lines = append(lines, r.Text())
    }
}

c:\> go build test.go

c:\> test.exe test.txt

Then I watched the process with Process Monitor while it ran. Part of the output:

test.exe  ReadFile  SUCCESS      Offset: 4,692,375, Length: 8,056
test.exe  ReadFile  SUCCESS      Offset: 4,700,431, Length: 7,198
test.exe  ReadFile  SUCCESS      Offset: 4,707,629, Length: 8,134
test.exe  ReadFile  SUCCESS      Offset: 4,715,763, Length: 7,361
test.exe  ReadFile  SUCCESS      Offset: 4,723,124, Length: 8,056
test.exe  ReadFile  SUCCESS      Offset: 4,731,180, Length: 4,322
test.exe  ReadFile  END OF FILE  Offset: 4,735,502, Length: 8,192

Equivalent Java code:

//Test.java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;

public class Test {
    public static void main(String[] args) {
        try {
            FileInputStream in = new FileInputStream("test.txt");
            BufferedReader br = new BufferedReader(new InputStreamReader(in));
            String strLine;
            while ((strLine = br.readLine()) != null) {
                // discard the line; only the read pattern matters here
            }
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}

c:\> javac Test.java

c:\> java Test

Then part of the monitoring output:

java.exe  ReadFile  SUCCESS       Offset: 4,694,016, Length: 8,192
java.exe  ReadFile  SUCCESS       Offset: 4,702,208, Length: 8,192
java.exe  ReadFile  SUCCESS       Offset: 4,710,400, Length: 8,192
java.exe  ReadFile  SUCCESS       Offset: 4,718,592, Length: 8,192
java.exe  ReadFile  SUCCESS       Offset: 4,726,784, Length: 8,192
java.exe  ReadFile  SUCCESS       Offset: 4,734,976, Length: 526
java.exe  ReadFile  END OF FILE   Offset: 4,735,502, Length: 8,192

As you can see, the buffer size in Java is 8192, and it reads 8192 bytes each time. Why does the Length in Go change on every read of the file?

I tried bufio.ReadString('\n') and bufio.ReadBytes('\n'), and both show the same behavior.

[Update] I also checked an equivalent sample in C:

//test.c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        FILE * fp;
        char * line = NULL;
        size_t len = 0;
        ssize_t read;
        fp = fopen("test.txt", "r");
        if (fp == NULL)
                exit(EXIT_FAILURE);
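        while ((read = getline(&line, &len, fp)) != -1) {
                /* read is an ssize_t, so it is printed with %zd */
                printf("Retrieved line of length %zd :\n", read);
        }
        free(line);   /* free(NULL) is a no-op, so no check is needed */
        fclose(fp);
        return EXIT_SUCCESS;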
}

The result is similar to the Java code (the buffer size is 65536 on my system). So why does Go behave so differently here?

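bufio.Scanner starts with a buffer of 4096 bytes, but it does not ask for a fixed amount on each read. It asks for "as much as fits" in the free part of its buffer, with this line: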

n, err := s.r.Read(s.buf[s.end:len(s.buf)])

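Here s.end marks the end of the data already in the buffer, so when part of an unfinished line is still waiting there, the requested length is the buffer size minus those bytes, which is why it changes from one read to the next.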

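The read size is not fixed: it depends on how much of the buffer is still free when Scan needs more data.

See the Go documentation for Scan (http://golang.org/pkg/bufio/#Scanner.Scan).

bufio.ReadString('\n') and bufio.ReadBytes('\n') read until the first occurrence of \n.

So nothing guarantees a 4096-byte ReadFile call each time.

If you need reads of a fixed size, do the IO yourself instead of relying on bufio.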

Source: https://habr.com/ru/post/1547319/

