The used space reported by statvfs() for a file system is larger than the sum of the sizes of all files in it

I have a small 50 MiB partition formatted as ext4 and mounted on /mnt/tmp. It contains a single directory with a set of photos.

I then use statvfs() to calculate the bytes used on the partition and lstat() to get the size of each file in it. For this I wrote the following program:

    #include <dirent.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <sys/statvfs.h>
    #include <sys/types.h>
    #include <unistd.h>

    // Sum of the sizes (plus slack) of all files found
    uint64_t totalFilesSize = 0;
    // Block size of the file system
    unsigned long sectorSize = 0;

    void readDir(const char *path)
    {
        DIR *directory = opendir(path);
        struct dirent *d_file;  // an entry in *directory

        if (directory == NULL) {
            perror(path);
            return;
        }

        while ((d_file = readdir(directory)) != NULL) {
            struct stat filestat;
            char abPath[1024];

            // Skip the self and parent entries
            if (!strcmp(".", d_file->d_name) || !strcmp("..", d_file->d_name))
                continue;
            snprintf(abPath, sizeof abPath, "%s/%s", path, d_file->d_name);
            if (lstat(abPath, &filestat) != 0)
                continue;

            switch (filestat.st_mode & S_IFMT) {
            case S_IFDIR:
            case S_IFREG: {
                printf("File: %s\nSize: %lld\n\n", abPath, (long long)filestat.st_size);
                // Add slack space to the final sum
                uint64_t slack = sectorSize - ((uint64_t)filestat.st_size % sectorSize);
                totalFilesSize += (uint64_t)filestat.st_size + slack;
                if (S_ISDIR(filestat.st_mode))
                    readDir(abPath);
                break;
            }
            }
        }
        closedir(directory);
    }

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "Error: Missing required parameter.\n");
            return 1;
        }

        struct statvfs info;
        if (statvfs(argv[1], &info) != 0) {
            perror("statvfs");
            return 1;
        }

        sectorSize = info.f_bsize;  // setting the global variable
        uint64_t usedBytes = (uint64_t)(info.f_blocks - info.f_bfree) * info.f_bsize;

        readDir(argv[1]);

        printf("Total blocks: %llu\nFree blocks: %llu\nSize of block: %lu\n"
               "Size in bytes: %" PRIu64 "\nTotal Files size: %" PRIu64 "\n",
               (unsigned long long)info.f_blocks, (unsigned long long)info.f_bfree,
               info.f_bsize, usedBytes, totalFilesSize);
        return 0;
    }

Passing the mount point of the partition (/mnt/tmp) as the parameter, the program displays this output:

    File: /mnt/tmp/lost+found
    Size: 12288

    File: /mnt/tmp/photos
    Size: 1024

    File: /mnt/tmp/photos/IMG_3195.JPG
    Size: 2373510

    File: /mnt/tmp/photos/IMG_3200.JPG
    Size: 2313695

    File: /mnt/tmp/photos/IMG_3199.JPG
    Size: 2484189

    File: /mnt/tmp/photos/IMG_3203.JPG
    Size: 2494687

    File: /mnt/tmp/photos/IMG_3197.JPG
    Size: 2259056

    File: /mnt/tmp/photos/IMG_3201.JPG
    Size: 2505596

    File: /mnt/tmp/photos/IMG_3202.JPG
    Size: 2306304

    File: /mnt/tmp/photos/IMG_3204.JPG
    Size: 2173883

    File: /mnt/tmp/photos/IMG_3198.JPG
    Size: 2390122

    File: /mnt/tmp/photos/IMG_3196.JPG
    Size: 2469315

    Total blocks: 47249
    Free blocks: 19160
    Size of block: 1024
    Size in bytes: 28763136
    Total Files size: 23790592

Pay attention to the last two lines. On a FAT32 file system the two sums are the same, but on ext4 they differ.

So the question is: why?

2 answers

statvfs() is a file-system-level operation. The used space is computed from the point of view of the file system. Therefore:

  • It will include any file system structures: for file systems based on the traditional Unix design, that includes the inodes and any indirect blocks.

    On some of my systems, I typically have one 256-byte inode per 32 KB of space for the root partition. Smaller partitions may have an even higher inode density to provide enough inodes for a large number of files. I believe the mke2fs default is one inode per 16 KB of space.

    Creating an 850 MB Ext4 file system with default parameters results in a file system with about 54,000 inodes, which consume more than 13 MB of space.

  • For Ext3/Ext4, it will also include the journal, which has a minimum size of 1024 file system blocks. With the common block size of 4 KB, that is at least 4 MB per file system.

    An 850 MB Ext4 file system will have a 16 MB journal by default.

  • The result from statvfs() will also include any deleted but still open files. This often happens on partitions that hold tmp directories used by applications.

  • To see the actual space used by a file with lstat(), you need to use the st_blocks field of the stat structure and multiply it by 512. Judging by the sizes displayed in your program output, you are using the st_size field, which is the exact file size in bytes. This is usually less than the space actually used: a 5 KB file occupies 8 KB on a file system with 4 KB blocks.

    Conversely, a sparse file will use fewer blocks than its file size suggests.

Taken together, the additional space usage described above adds up to quite a noticeable amount, which explains the mismatch you see.
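The point about deleted but still open files can be checked with a small experiment. The sketch below assumes a writable /tmp and uses a hypothetical helper name; it creates a file, unlinks it while keeping the descriptor open, and returns what fstat() reports afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a temporary file under /tmp (an assumed
 * location), write `len` bytes, then unlink it while keeping it open.
 * Returns the open descriptor and fills *st via fstat(), or -1 on error.
 * The caller closes the descriptor, which finally releases the blocks. */
int open_unlinked_file(size_t len, struct stat *st)
{
    char path[] = "/tmp/statvfs_demo_XXXXXX";
    char buf[4096];
    int fd;

    assert(st != NULL);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;

    memset(buf, 'x', sizeof buf);
    while (len > 0) {
        size_t chunk = len < sizeof buf ? len : sizeof buf;
        ssize_t n = write(fd, buf, chunk);
        if (n <= 0) {
            unlink(path);
            close(fd);
            return -1;
        }
        len -= (size_t)n;
    }

    /* After unlink() no directory entry refers to the file, but the
     * open descriptor keeps its data blocks allocated. */
    if (unlink(path) != 0 || fstat(fd, st) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

On typical local Linux file systems the returned st_nlink is 0 while st_size still shows the bytes written, so a directory walk cannot find the file even though statvfs() keeps counting its blocks until the descriptor is closed.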

EDIT:

  • I just noticed that your program does account for slack space. Although this is not the recommended way to compute the actual (as opposed to apparent) used space, it seems to work, so you are not missing space there. On the other hand, you are missing the space used by the root directory of the file system, although that will probably be just a block or two :-)

  • You might want to look at the output of tune2fs -l /dev/xxx. It lists several relevant numbers, including the space reserved for file system metadata.
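One detail of the slack formula in the question, sectorSize - (st_size % sectorSize): when the size is already an exact multiple of the block size, as with the 12288-byte lost+found and the 1024-byte photos directory in the posted output, it adds one full extra block. A minimal sketch of a round-up helper that avoids this, with a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round `size` up to the next multiple of `block`.
 * A size that is already an exact multiple is returned unchanged. */
uint64_t round_up_to_block(uint64_t size, uint64_t block)
{
    uint64_t rem;

    assert(block > 0);
    rem = size % block;
    return rem == 0 ? size : size + (block - rem);
}
```

With 1024-byte blocks this leaves 12288 and 1024 as they are, while 2373510 becomes 2373632.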

By the way, most of what your program does can be done with df and du:

    # du -a --block-size=1 mnt/
    2379776 mnt/img0.jpg
    3441664 mnt/img1.jpg
    2124800 mnt/img2.jpg
    12288   mnt/lost+found
    7959552 mnt/
    # df -B1 mnt/
    Filesystem 1B-blocks     Used Available Use% Mounted on
    /dev/loop0  50763776 12969984  35172352  27% /tmp/mnt

By the way, the Ext4 test file system above was created with the default mkfs options in a 50 MB image file. It has a block size of 1024 bytes, 12,824 128-byte inodes that consume 1,603 KB, and a journal of 4,096 blocks, which uses 4,096 KB. Another 199 blocks are reserved for the group descriptor tables, according to tune2fs.
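For comparison with the "Used" column of df, here is a rough sketch of the same calculation done through statvfs(). The helper names are hypothetical. It multiplies by f_frsize, because POSIX defines f_blocks in units of f_frsize; on the ext4 file system in the question f_frsize and f_bsize happen to be equal, so the program there gives the same number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical helper: used bytes, computed as (total - free) blocks
 * times the fragment size. */
uint64_t used_bytes(const struct statvfs *vfs)
{
    assert(vfs != NULL);
    return (uint64_t)(vfs->f_blocks - vfs->f_bfree) * vfs->f_frsize;
}

/* Hypothetical helper: query the file system containing `path`.
 * Returns 0 and stores the result in *out, or -1 if statvfs() fails. */
int used_bytes_at(const char *path, uint64_t *out)
{
    struct statvfs vfs;

    assert(path != NULL && out != NULL);
    if (statvfs(path, &vfs) != 0)
        return -1;
    *out = used_bytes(&vfs);
    return 0;
}
```

With the values from the question (47249 total blocks, 19160 free, 1024-byte blocks) used_bytes() gives 28763136, matching the "Size in bytes" line of the posted output.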


Inodes are probably not counted in your sum, and they may themselves hold some small data.

If a file is sparse, its size is larger than the space it actually occupies.

If a file is hard-linked more than once, all of its names share a single inode.
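To avoid counting a hard-linked file once per name, a directory walk can remember which (device, inode) pairs it has already seen. The following is a minimal sketch under stated assumptions: the type, the function name and the fixed capacity are all hypothetical, and a linear search is used only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_SEEN 1024  /* arbitrary capacity chosen for this sketch */

/* Hypothetical set of (device, inode) pairs already counted. */
struct seen_set {
    dev_t dev[MAX_SEEN];
    ino_t ino[MAX_SEEN];
    size_t count;
};

/* Returns 1 if the pair is new (and records it), 0 if it was already
 * recorded, and -1 if the set is full. */
int seen_add(struct seen_set *s, dev_t dev, ino_t ino)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < s->count; i++)
        if (s->dev[i] == dev && s->ino[i] == ino)
            return 0;
    if (s->count == MAX_SEEN)
        return -1;
    s->dev[s->count] = dev;
    s->ino[s->count] = ino;
    s->count++;
    return 1;
}
```

A walk would call seen_add() with st_dev and st_ino from lstat() for files whose st_nlink is greater than 1, and add their blocks to the total only when it returns 1.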

An article about Ext4 here by Kumar et al


Source: https://habr.com/ru/post/1388730/

