Tuesday, 15 April 2025

File Descriptors and File Deletion

In my previous post about inodes I mentioned that I would write a follow-up about the other big piece of how the OS manages files: File Descriptors. This is that post. I already mentioned there that a process references the files it has open by inode, not by path. It does so through File Descriptors. When a process opens a file it gets a File Descriptor, which works like this:

file descriptors index into a per-process file descriptor table maintained by the kernel, that in turn indexes into a system-wide table of files opened by all processes, called the file table. This table records the mode with which the file (or other resource) has been opened: for reading, writing, appending, and possibly other modes. It also indexes into a third table called the inode table that describes the actual underlying files.[3]
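
A minimal Python sketch can make this concrete. It assumes a POSIX-like system, and the file name demo.txt is made up for the example. It opens a file with os.open, which returns the raw descriptor (a small integer), and checks that looking up the inode through the descriptor gives the same result as looking it up through the path:

```python
import os
import tempfile


def describe_descriptor():
    """Open a temporary file and compare what the descriptor and the path report."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
        # os.open returns the raw descriptor: a small non-negative integer.
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
        try:
            inode_via_fd = os.fstat(fd).st_ino     # looked up through the descriptor
            inode_via_path = os.stat(path).st_ino  # looked up through the path
        finally:
            os.close(fd)
    return fd, inode_via_fd, inode_via_path


if __name__ == "__main__":
    fd, via_fd, via_path = describe_descriptor()
    print(f"descriptor {fd} -> inode {via_fd} (path lookup gives {via_path})")
```

The descriptor itself carries no information about the file; it is only an index that the kernel resolves through its tables.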

From ChatGPT we learn:

An entry in the system-wide File Table does not contain the original path that was used to open the file. Instead, it primarily contains:

A reference to the inode (which uniquely identifies the file on disk).
The file access mode (read, write, etc.).
The file offset (current read/write position in the file).
A reference count (number of processes using the same file table entry).
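
The file offset living in the File Table entry (and not in the descriptor) can be observed from user space. The sketch below is only an illustration under the assumption of a POSIX-like system, with a made-up file name. It duplicates a descriptor with os.dup, so both descriptors share one entry, and opens the same file a second time, which creates an independent entry. As far as I understand it, this is also why the reference count is better thought of as counting descriptors (duplicated, or inherited by child processes) rather than strictly processes.

```python
import os
import tempfile


def compare_offsets():
    """Show that a dup'ed descriptor shares its offset and a second open() does not."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"0123456789")

        fd_a = os.open(path, os.O_RDONLY)
        fd_dup = os.dup(fd_a)              # second descriptor, same file table entry
        fd_b = os.open(path, os.O_RDONLY)  # separate open(): a new file table entry
        try:
            os.read(fd_a, 4)  # moves the offset of the shared entry to 4
            offset_dup = os.lseek(fd_dup, 0, os.SEEK_CUR)  # query without moving
            offset_b = os.lseek(fd_b, 0, os.SEEK_CUR)
        finally:
            for fd in (fd_a, fd_dup, fd_b):
                os.close(fd)
    return offset_dup, offset_b


if __name__ == "__main__":
    print(compare_offsets())  # prints (4, 0)
```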

And if you are wondering why the path is not stored in the File Table:

Why is the original path not stored?

The kernel tracks it by its inode, not by the path. This is because:

Files can have multiple hard links (multiple paths referring to the same inode).
A file can be renamed or deleted while it is still open, making the original path potentially invalid.
Efficiency—storing the path for every open file would add overhead.
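
The first two reasons are easy to try out. The following sketch assumes a POSIX-like system and a filesystem that supports hard links, and all three file names are invented. It renames a file while it is open, keeps writing through the old descriptor, and then adds a second hard link:

```python
import os
import tempfile


def rename_and_link():
    """Rename an open file, write through the old descriptor, then add a hard link."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        original = os.path.join(tmp_dir, "original.txt")  # hypothetical names
        renamed = os.path.join(tmp_dir, "renamed.txt")
        alias = os.path.join(tmp_dir, "alias.txt")

        fd = os.open(original, os.O_CREAT | os.O_RDWR, 0o600)
        try:
            inode_before = os.fstat(fd).st_ino
            os.rename(original, renamed)     # the path used for open() is now gone
            os.write(fd, b"still writable")  # the descriptor keeps working
            os.link(renamed, alias)          # second path for the same inode
            info = os.fstat(fd)
            with open(alias, "rb") as f:
                content = f.read()
        finally:
            os.close(fd)
    same_inode = inode_before == info.st_ino
    return same_inode, info.st_nlink, content


if __name__ == "__main__":
    print(rename_and_link())  # prints (True, 2, b'still writable')
```

After the rename and the extra link there is no single "right" path the kernel could have stored for this descriptor, which is the point of the list above.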

We also know that the inode table does not contain information about the paths that point to an inode (it only holds a counter of how many paths, i.e. hard links, reference that inode: the link count). Notice that inodes are stored on disk (otherwise filesystems would be ephemeral), in areas reserved by the filesystem (the exact layout depends on the filesystem; ext4, for example, spreads its inode tables across block groups), and also in memory. The kernel does not load the whole table when the filesystem is mounted; it reads inodes on demand and keeps the ones in use cached in memory for faster access to file metadata. The thing is that when we use the lsof command to see the files a process has open, we see paths to those files (not just inode numbers), so how does that work? ChatGPT says:

When you run lsof -p PID, it reconstructs the paths based on available information by:

Checking symbolic links in /proc/PID/fd/ (e.g., /proc/1234/fd/3 might show a path like /home/user/file.txt).
Looking at /proc/PID/maps (for memory-mapped files).
Using /proc/PID/cwd (for relative paths).
In some cases, traversing /proc/mounts to help resolve paths.
However, this is an approximation because the exact original path used by open() may not always be retrievable.
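
The first item can be checked directly on Linux, where each entry in /proc/PID/fd/ is a symbolic link whose target is the path the kernel currently associates with the open file. The sketch below is Linux-specific (it relies on /proc/self/fd) and uses a made-up file name. It reads the link before and after unlinking the file:

```python
import os
import tempfile


def path_reported_for_descriptor():
    """Return what /proc/self/fd reports before and after the file is unlinked (Linux only)."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
        try:
            real_path = os.path.realpath(path)  # path with any symlinks resolved
            link = f"/proc/self/fd/{fd}"
            before = os.readlink(link)  # current path of the open file
            os.unlink(path)
            after = os.readlink(link)   # Linux appends " (deleted)" to the old path
        finally:
            os.close(fd)
    return real_path, before, after


if __name__ == "__main__":
    if os.path.isdir("/proc/self/fd"):
        for value in path_reported_for_descriptor():
            print(value)
    else:
        print("/proc/self/fd is not available on this system")
```

This is also how a deleted but still open file shows up when inspecting a process: the path is still listed, marked as deleted.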

When we delete a file (I mean: rm myFile.txt) that hard link is removed, but there are two things that can prevent or postpone the deletion of the corresponding inode (and the data blocks referenced from it). Obviously, if there are other hard links to that file, the inode will not be deleted. If no other hard links exist, the OS checks whether any process still has the file open (conceptually, whether any entry in the File Table still points to that inode) and, if so, waits until the file is closed. This is nicely explained, again, by ChatGPT:

When a file is deleted (unlink() in Linux/Unix), the OS follows this process:

Check the link count in the inode:
  If there are other hard links (i.e., link count > 1), the directory entry is removed, but the inode remains because other names still reference it.
  If there are no other hard links (link count == 1), then the inode might be deleted, but not yet.
Check the system-wide File Table:
  If any process still has the file open, the inode and data remain in use until all processes close it.
  The file is effectively invisible to new processes (since its directory entry is gone), but existing processes can still read/write to it.
Final cleanup (when the last process closes the file):
  If the link count was already 0 and no file descriptors remain open, the inode and disk blocks are finally freed.
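
The sequence above can be reproduced with a few system calls. This is a minimal sketch, assuming a POSIX-like system and a local filesystem (network filesystems may behave differently). It creates myFile.txt in a temporary directory, unlinks it while it is still open, and then reads it back through the descriptor:

```python
import os
import tempfile


def unlink_while_open():
    """Delete a file that is still open and keep using it through its descriptor."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "myFile.txt")
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
        try:
            os.write(fd, b"hello")
            links_before = os.fstat(fd).st_nlink  # one directory entry points here
            os.unlink(path)                       # same system call that rm uses
            links_after = os.fstat(fd).st_nlink   # no names left, inode still alive
            visible = os.path.exists(path)        # new lookups by path fail
            os.lseek(fd, 0, os.SEEK_SET)
            data = os.read(fd, 5)                 # the data blocks are still there
        finally:
            os.close(fd)  # last reference dropped: inode and blocks can be freed
    return links_before, links_after, visible, data


if __name__ == "__main__":
    print(unlink_while_open())  # prints (1, 0, False, b'hello')
```

Only the final os.close lets the kernel reclaim the space, which is why deleting a large log file that a running process still has open does not free disk space until that process closes it or exits.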
