getdents returns struct linux_dirent records, and it does so for any underlying filesystem type. The on-disk format may be completely different and known only to that filesystem's driver, so a plain read call from user space could not work. In other words, getdents can convert from the native format to populate linux_dirent.
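As a rough illustration of what "getdents populates linux_dirent" means in practice, here is a minimal sketch that walks a directory with the raw syscall. It assumes Linux and uses the 64-bit variant, getdents64, because plain getdents is not available on every architecture. The struct layout is copied from the getdents64 man page (glibc does not export it for the raw syscall), and list_dir is a name made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Record layout filled in by getdents64(2), as documented in the man page. */
struct linux_dirent64 {
    uint64_t       d_ino;     /* inode number */
    int64_t        d_off;     /* offset to the next record */
    unsigned short d_reclen;  /* length of this record in bytes */
    unsigned char  d_type;    /* file type (DT_REG, DT_DIR, ...) */
    char           d_name[];  /* NUL-terminated file name */
};

/* Hypothetical helper: print every name in `path`.
 * Returns the number of entries seen, or -1 on error. */
long list_dir(const char *path)
{
    _Alignas(8) char buf[8192];
    long count = 0;
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    for (;;) {
        long nread = syscall(SYS_getdents64, fd, buf, sizeof buf);
        if (nread < 0) {
            close(fd);
            return -1;
        }
        if (nread == 0)          /* end of directory */
            break;
        /* The buffer holds a packed sequence of variable-length records. */
        for (long pos = 0; pos < nread; ) {
            struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);
            assert(d->d_reclen > 0);   /* guard against looping forever */
            printf("%s\n", d->d_name);
            pos += d->d_reclen;
            count++;
        }
    }
    close(fd);
    return count;
}
```

Calling list_dir("/") prints one name per line regardless of which filesystem backs the directory; the caller never sees the native on-disk layout.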
couldn't the same thing be said about reading bytes from a file using read()? The on-disk format of file data isn't necessarily uniform across filesystems, or even contiguous on disk, so reading a series of bytes from disk is again something I would expect to be delegated to the filesystem driver.
Non-contiguous file data is handled by the VFS ["virtual filesystem"] layer, regardless of how an FS chooses to organize the block list for a file. For example, ext4 uses "inodes" ("index" or "information" nodes), which use an "ISAM" ("indexed sequential access method") style of organization, whereas an MS-DOS FS may have a completely different organization.
Each FS driver registers a table of VFS callback functions when it starts. For each operation (for example, open/close/read/write/seek) there is a corresponding entry in the table.
The VFS layer (that is, the code entered from the user-space system call) "calls down" into the FS driver, and the FS driver performs the operation, doing whatever it considers necessary to complete the request.
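The callback-table idea can be sketched in user-space C. This is only a toy model, not the kernel's real structures: toy_file_ops, memfs_read, memfs_ops, and toy_vfs_read are all hypothetical names invented for the illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a per-filesystem operation table. */
struct toy_file_ops {
    const char *fs_name;
    long (*read)(void *file, char *buf, size_t len);
};

/* A toy "driver": its read callback copies from an in-memory string. */
static long memfs_read(void *file, char *buf, size_t len)
{
    const char *data = file;
    size_t n = strlen(data);
    if (n > len)
        n = len;
    memcpy(buf, data, n);
    return (long)n;
}

/* The table this toy driver would "register" at startup. */
static const struct toy_file_ops memfs_ops = { "memfs", memfs_read };

/* Toy "VFS layer": it knows nothing about the storage format and only
 * dispatches through the table. Returns -1 if no callback is registered. */
long toy_vfs_read(const struct toy_file_ops *ops, void *file,
                  char *buf, size_t len)
{
    if (ops == NULL || ops->read == NULL)
        return -1;
    return ops->read(file, buf, len);
}
```

A second toy driver with a different storage format would only need its own table; toy_vfs_read would stay unchanged.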
I assume that the FS driver knows about the location of the data inside a regular file on disk - even if the data was fragmented.
Yes. For example, if a read request asks for the first three blocks of a file (for example, 0, 1, 2), the FS looks up the indexing information for the file and gets the list of physical blocks to read from the disk surface (for example, 1000000, 200, 37). All of this is handled transparently in the FS driver.
A user space program will simply see that its buffer is filled with the correct data, regardless of how complex FS indexing and block fetching are.
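The logical-to-physical lookup can be pictured with a toy index. This sketch assumes the simplest possible scheme, a flat array with one physical block number per logical block. Real filesystems use extents, indirect blocks, and so on, and toy_index and toy_map_blocks are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical per-file index: entry i holds the physical block number
 * of logical block i. */
struct toy_index {
    const unsigned long *phys;
    size_t nblocks;
};

/* Translate logical blocks [first, first + count) into physical block
 * numbers. Returns how many blocks were mapped, which is fewer than
 * `count` if the file ends first. */
size_t toy_map_blocks(const struct toy_index *idx, size_t first,
                      size_t count, unsigned long *out)
{
    size_t n = 0;
    while (n < count && first + n < idx->nblocks) {
        out[n] = idx->phys[first + n];
        n++;
    }
    return n;
}
```

With an index holding 1000000, 200, 37, a request for logical blocks 0 through 2 yields exactly those three physical blocks, in that order.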
It may be more correct to refer to this as transferring inode data, since files have inodes (that is, an inode holds the indexing information needed to "scatter/gather" the FS blocks for the file). But the FS driver also uses this internally to read from a directory. That is, each directory has an inode that tracks the indexing information for that directory.
So, to the FS driver, a directory is much like a flat file that contains specially formatted information. These are the directory "entries". This is what getdents returns. It sits on top of the inode indexing layer.
Directory entries can be of variable length [based on the length of the file name]. Thus, the on-disk format would be (call it "Type A"):
static part|variable length name static part|variable length name ...
But ... some filesystems are organized differently (call it "Type B"):
<static1>,<static2>... <variable1>,<variable2>,...
Thus, a Type A organization might be read atomically by a user-space read(2) call, but Type B would be difficult. So the getdents VFS call handles this.
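To make the Type A layout concrete, here is a sketch that walks a buffer of such records and reports how many are complete. The record format is invented for the example (a 4-byte inode number, a 1-byte name length, then the name bytes) and does not correspond to any real filesystem. toy_count_type_a is a hypothetical name.

```c
#include <stddef.h>

/* Hypothetical Type A record: 4-byte inode number, 1-byte name length,
 * then the name itself (no terminating NUL). */
enum { TOY_STATIC_LEN = 5 };

/* Count the complete records in buf[0..len). *leftover receives the
 * number of trailing bytes that belong to an incomplete record. */
size_t toy_count_type_a(const unsigned char *buf, size_t len,
                        size_t *leftover)
{
    size_t pos = 0, count = 0;

    while (len - pos >= TOY_STATIC_LEN) {
        size_t name_len = buf[pos + 4];
        if (len - pos - TOY_STATIC_LEN < name_len)
            break;              /* the name is cut off by the buffer end */
        pos += TOY_STATIC_LEN + name_len;
        count++;
    }
    *leftover = len - pos;
    return count;
}
```

If the buffer ends one byte early, the last record is reported as leftover bytes rather than as an entry. A caller working through raw read(2) would have to detect and handle that case itself, and a Type B layout could not be walked this way at all, because the static and variable parts are not adjacent.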
couldn't the VFS also present a "linux_dirent" view of a directory, just as the VFS presents a "flat view" of a file?
That is what getdents is for.
Then again, I am assuming that the FS driver knows the type of each file and thus could return linux_dirent when read() is called on a directory, rather than a series of bytes.
getdents did not always exist. When directory entries were fixed size and there was only one FS format, the readdir(3) call probably did a read(2) underneath and got a series of bytes [which is all that read(2) provides]. Actually, IIRC, in the beginning there was only readdir(2), and getdents and readdir(3) did not exist.
But what do you do if the read(2) is "short" (for example, two bytes too small)? How do you communicate that to the app?
My question is more like this: since the FS driver can determine whether a file is a directory or a regular file (and I am assuming that it can), and since it has to intercept all read() calls eventually, why isn't read() on a directory implemented as reading linux_dirent?
read on a directory is not intercepted and converted to getdents, because the OS is minimalist. It expects you to know the difference and make the appropriate syscall.
You do open(2) for files or dirs [opendir(3) is a wrapper that does open(2) underneath]. You can read/write/seek on files and seek/getdents on dirs.
But ... doing read on a directory returns EISDIR. [Side note: I had forgotten this in my original comments.] In the simple "flat data" model that read provides, there is no way to convey/control everything that getdents can/does.
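The EISDIR behaviour is easy to check on Linux. The sketch below opens a path read-only and attempts a read(2) on it. try_read is a hypothetical helper name, and the result is Linux behaviour rather than a guarantee for every Unix.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try read(2) on `path`.
 * Returns 0 if the read succeeded, otherwise the errno value of the
 * call that failed (open or read). */
int try_read(const char *path)
{
    char buf[64];
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    ssize_t n = read(fd, buf, sizeof buf);
    int err = (n < 0) ? errno : 0;   /* save errno before close() */
    close(fd);
    return err;
}
```

On Linux, try_read("/") is expected to return EISDIR: the open succeeds, because open(2) works on directories, but the read is refused.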
Thus, rather than allowing an inferior way to get partial/incorrect information, it is simpler for both the kernel and the application developer to go through the getdents interface.
In addition, getdents does things atomically. While you are reading directory entries in a given program, other programs may be creating, deleting, or renaming files in that directory, right in the middle of your getdents sequence.
getdents presents an atomic view. Either a file exists or it does not. It has been renamed or it has not. This way, you do not get a "half modified" view, no matter how much "turmoil" is going on around you. When you ask getdents for 20 entries, you will get them [or 10 if there are only that many].
Side note: A useful trick is to "overspecify" the count. That is, tell getdents that you want 50,000 entries [you must provide the space]. Usually you get back about 100 or so. But now you have an atomic point-in-time snapshot of the complete directory. I sometimes do this instead of looping with a count of 1. YMMV. You still need to guard against a file disappearing immediately afterwards, but at least you can detect it (i.e. a subsequent open of the file fails).
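A sketch of the "overspecify" trick, again assuming Linux and the getdents64 variant of the syscall. Note that the modern syscall takes a buffer size in bytes rather than an entry count, so "asking for 50,000 entries" becomes "passing a large buffer". snapshot_dir is a hypothetical name. Whether one call really returns the whole directory depends on the filesystem, so the sketch makes a second call to check.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Record layout from the getdents64(2) man page. */
struct linux_dirent64 {
    uint64_t       d_ino;
    int64_t        d_off;
    unsigned short d_reclen;
    unsigned char  d_type;
    char           d_name[];
};

/* Hypothetical helper: read `path` with a single large getdents64 call.
 * Returns the number of entries that call produced, or -1 on error.
 * *complete is set to 1 if a second call returned 0, i.e. the first
 * call had already consumed the whole directory. */
long snapshot_dir(const char *path, size_t bufsize, int *complete)
{
    long count = -1;
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    char *buf = malloc(bufsize);

    *complete = 0;
    if (fd >= 0 && buf != NULL) {
        long nread = syscall(SYS_getdents64, fd, buf, bufsize);
        if (nread >= 0) {
            count = 0;
            for (long pos = 0; pos < nread; count++) {
                struct linux_dirent64 *d =
                    (struct linux_dirent64 *)(buf + pos);
                assert(d->d_reclen > 0);
                pos += d->d_reclen;
            }
            /* Nothing left to read means the first call saw everything. */
            *complete = (syscall(SYS_getdents64, fd, buf, bufsize) == 0);
        }
    }
    free(buf);
    if (fd >= 0)
        close(fd);
    return count;
}
```

For a directory with a hundred or so entries, a call such as snapshot_dir(path, 1 << 20, &complete) would normally come back with complete set to 1. A caller that gets 0 needs to fall back to a loop.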
So, you always get "whole" entries, and no entry for a file that has just been deleted. This does not mean that the file still exists, merely that it was there at the time of the getdents. Another process can erase it an instant later, but not in the middle of the getdents.
If read(2) were allowed, you would have to guess how much data to read, and you would not know which entries were fully formed and which were in a partial state. If the FS had the Type B organization above, a single read could not atomically get the static part and the variable part in one step.
It would be philosophically wrong to slow down read(2) to do what getdents does.
The getdents, unlink, creat, rmdir, and rename (etc.) operations are interlocked and serialized to prevent any inconsistencies [not to mention FS corruption or leaked/lost FS blocks]. In other words, these system calls all "know about each other".
If pgmA renames "x" to "z" and pgmB renames "y" to "z", they do not collide. One goes first and the other second, but no FS blocks are ever lost/leaked. getdents gets a whole view (whether it is "x y", "y z", "x z", or "z"), but it will never see "x y z" at the same time.