Code example:
package main

import (
	"bufio"
	"os"
)

func main() {
	if len(os.Args) != 2 {
		println("Usage:", os.Args[0], "")
		os.Exit(1)
	}
	fileName := os.Args[1]
	fp, err := os.Open(fileName)
	if err != nil {
		println(err.Error())
		os.Exit(2)
	}
	defer fp.Close()

	// Read the file line by line and keep every line in memory.
	r := bufio.NewScanner(fp)
	var lines []string
	for r.Scan() {
		lines = append(lines, r.Text())
	}
}
C:\> go build test.go
C:\> test.exe test.txt
Then I watched the process with Process Monitor while it ran. Part of the result:
test.exe ReadFile SUCCESS Offset: 4,692,375, Length: 8,056
test.exe ReadFile SUCCESS Offset: 4,700,431, Length: 7,198
test.exe ReadFile SUCCESS Offset: 4,707,629, Length: 8,134
test.exe ReadFile SUCCESS Offset: 4,715,763, Length: 7,361
test.exe ReadFile SUCCESS Offset: 4,723,124, Length: 8,056
test.exe ReadFile SUCCESS Offset: 4,731,180, Length: 4,322
test.exe ReadFile END OF FILE Offset: 4,735,502, Length: 8,192
Equivalent Java code:
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
public class Test {
    public static void main(String[] args) {
        try {
            FileInputStream in = new FileInputStream("test.txt");
            BufferedReader br = new BufferedReader(new InputStreamReader(in));
            String strLine;
            // Read every line and discard it.
            while ((strLine = br.readLine()) != null) {
                ;
            }
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}
C:\> javac Test.java
C:\> java Test
Then part of the monitoring output:
java.exe ReadFile SUCCESS Offset: 4,694,016, Length: 8,192
java.exe ReadFile SUCCESS Offset: 4,702,208, Length: 8,192
java.exe ReadFile SUCCESS Offset: 4,710,400, Length: 8,192
java.exe ReadFile SUCCESS Offset: 4,718,592, Length: 8,192
java.exe ReadFile SUCCESS Offset: 4,726,784, Length: 8,192
java.exe ReadFile SUCCESS Offset: 4,734,976, Length: 526
java.exe ReadFile END OF FILE Offset: 4,735,502, Length: 8,192
As you can see, the buffer size in Java is 8192, and every read except the last one asks for exactly 8192 bytes. Why does the Length in the Go program change from one read to the next?
I also tried ReadString('\n') and ReadBytes('\n') on a bufio.Reader, and both of them show the same behavior.
[Update]
I also checked an equivalent sample in C:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    FILE *fp;
    char *line = NULL;
    size_t len = 0;
    ssize_t read;

    fp = fopen("test.txt", "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);

    /* getline() grows the line buffer as needed and returns -1 at EOF. */
    while ((read = getline(&line, &len, fp)) != -1) {
        printf("Retrieved line of length %zd :\n", read);
    }

    free(line);
    fclose(fp);
    return EXIT_SUCCESS;
}
The result is similar to the Java code: every read has the same length (the buffer size is 65536 on my system). So why does Go behave so differently here?