Where to find file system counter information in MapReduce

While running a MapReduce job, I get the following output:

    11/09/15 21:35:16 INFO mapreduce.Job: Counters: 24
        File System Counters
            FILE: Number of bytes read=255967
            FILE: Number of bytes written=397273
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
        Map-Reduce Framework
            Map input records=5
            Map output records=5
            Map output bytes=45
            .......

Here, the first line says "Counters: 24". Where can I find more information about these counters?

I am most interested in "Number of read operations=0": what does it measure?
If anyone knows, or can point me to relevant links, please answer.

Thanks.

1 answer

I would recommend Tom White's Hadoop book (Hadoop: The Definitive Guide), especially chapter 8.1, which gives a detailed list of the counters and what they mean. An online version is available here.

The "large read operations" counter is the number of large file system read operations, such as listing the files in a large directory. It was introduced in HADOOP-6859, which describes it as follows: on a file system, most operations are small, except for listFiles on a large directory. Iterative listFiles was introduced in HDFS to break that one large operation into smaller steps. This counter is incremented for each iteration of listFiles when listing a large directory.

This ticket also explains some other new counters:

  • read operations - the number of read operations such as listStatus, getFileBlockLocations, open, etc.
  • write operations - the number of write operations such as create, append, setPermission, etc.
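
The counter definitions above can be illustrated with a small toy model. This is only a sketch and not Hadoop code: the class FsCounterSketch, its methods, and the batch size of 1000 are hypothetical names and values chosen for illustration. It also assumes that a listing which fits into a single batch is counted as an ordinary read operation, while a listing that needs several batches adds one large read operation per batch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Toy model of per-file-system operation counters. NOT Hadoop code: all names
 * here are hypothetical and only mirror the counter definitions quoted above.
 */
final class FsCounterSketch {
    long readOps;       // small reads: listStatus, getFileBlockLocations, open, ...
    long largeReadOps;  // one per batch when listing a large directory
    long writeOps;      // create, append, setPermission, ...

    /** Hypothetical batch size; a real file system chooses its own limit. */
    static final int LIST_BATCH_SIZE = 1000;

    /** Opening a file counts as one (small) read operation. */
    void open(String path) {
        readOps++;
    }

    /** Creating a file counts as one write operation. */
    void create(String path) {
        writeOps++;
    }

    /**
     * Lists a directory. Assumption: a directory that fits into one batch
     * costs one ordinary read operation; a larger one is fetched in batches
     * and each batch costs one large read operation.
     */
    List<String> listFiles(List<String> entries) {
        if (entries.size() <= LIST_BATCH_SIZE) {
            readOps++;
            return new ArrayList<>(entries);
        }
        List<String> result = new ArrayList<>(entries.size());
        for (int offset = 0; offset < entries.size(); offset += LIST_BATCH_SIZE) {
            int end = Math.min(offset + LIST_BATCH_SIZE, entries.size());
            result.addAll(entries.subList(offset, end));
            largeReadOps++;
        }
        return result;
    }

    public static void main(String[] args) {
        FsCounterSketch counters = new FsCounterSketch();
        counters.open("/data/input.txt");                      // read op
        counters.create("/data/output.txt");                   // write op
        counters.listFiles(Collections.nCopies(5, "file"));    // small listing: read op
        counters.listFiles(Collections.nCopies(2500, "file")); // large listing: 3 batches
        System.out.println("read operations=" + counters.readOps);
        System.out.println("large read operations=" + counters.largeReadOps);
        System.out.println("write operations=" + counters.writeOps);
    }
}
```

In this model, a job that never opens or lists anything through a given file system would report 0 read operations for it, and the large read counter only moves when a directory listing has to be split into several batches.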

I would also suggest looking at the FileSystem.Statistics class, which describes some additional file system counters; it is documented here.


Source: https://habr.com/ru/post/1482163/

