Prior to this change, both uid_t and gid_t were typedef'ed to `u32`.
This made it easy to use them interchangeably. Let's not allow that.
This patch adds UserID and GroupID using the AK::DistinctNumeric
mechanism we've already been employing for pid_t/ProcessID.
We often get queried for the root inode, and it will always be cached
in memory anyway, so let's make Ext2FS::root_inode() fast by keeping
the root inode in a dedicated member variable.
This fixes#8133.
Ext2FSInode::remove_child() searches the lookup cache, so if it's not
initialized, removing the child fails. If the child was a directory,
this led to it being corrupted and having 0 children.
I also added populate_lookup_cache to add_child. I hadn't seen any
bugs there, but if the cache wasn't populated before, adding that
one entry would make it think it was populated, so that would cause
bugs later.
The error handling in all these cases was still using the old style
negative values to indicate errors. We have a nicer solution for this
now with KResultOr<T>. This change switches the interface and then all
implementers to use the new style.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
Since the inode is the logical owner of its block list, let's move the
code that computes the block list there, and also stop hogging the FS
lock while we compute the block list, as there is no need for it.
There are two locks in the Ext2FS implementation:
* The FS lock (Ext2FS::m_lock)
This governs access to the superblock, block group descriptors,
and the block & inode bitmap blocks. It's held while allocating
or freeing blocks/inodes.
* The inode lock (Ext2FSInode::m_lock)
This governs access to the inode metadata, including the block
list, and to the content data as well. It's held while doing
basically anything with the inode.
Once an on-disk block/inode is allocated, it logically belongs
to the in-memory Inode object, so there's no need for the FS lock
to be taken while manipulating them, the inode lock is all you need.
This dramatically reduces the impact of disk I/O on path resolution
and various other things that look at individual inodes.
This patch combines inode the scan for an available inode with the
updating of the bit in the inode bitmap into a single operation.
We also exit the scan immediately when we find an inode, instead of
continuing until we've scanned all the eligible groups(!)
Finally, we stop holding the filesystem lock throughout the entire
operation, and instead only take it while actually necessary
(during inode allocation, flush, and inode cache update.)
Improve a bunch of situations where we'd previously panic the kernel
on failure. We now propagate whatever error we had instead. Usually
that'll be EIO.
Both inode and block allocation operate on bitmap blocks and update
counters in the superblock and group descriptor.
Since we're here, also add some error propagation around this code.
The way we read/write directories is very inefficient, and this doesn't
solve any of that. It does however reduce memory usage of directory
entry vectors by 25% which has nice immediate benefits.
We had two ways of creating a new Ext2FS inode. Either they were empty,
or they started with some pre-allocated size.
In practice, the pre-sizing code path was only used for new directories
and it didn't actually improve anything as far as I can tell.
This patch simplifies inode creation by simply always allocating empty
inodes. Block allocation and block list generation now always happens
on the same code path.
..and allow implicit creation of KResult and KResultOr from ErrnoCode.
This means that kernel functions that return those types can finally
do "return EINVAL;" and it will just work.
There's a handful of functions that still deal with signed integers
that should be converted to return KResults.
When computing the list of blocks to deallocate when freeing an inode,
we would stop collecting blocks after reaching the inode's block count.
Since we're getting rid of the inode, we need to also include the meta
blocks used by the on-disk block list itself.
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.
So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.
To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.
Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
A change introduced in 5e01234 made it the resposibility of each
filesystem to have the file types returned from
'traverse_as_directory' match up with the DT_* types.
However, this caused corruption of the Ext2FS file format because
the Ext2FS uses 'traverse_as_directory' internally when manipulating
the file system. The result was a mixture between EXT2_FT_* and DT_*
file types in the internal Ext2FS structures.
Starting with this commit, the conversion from internal filesystem file
types to the user facing DT_* types happens at a later stage,
in the 'FileDescription::get_dir_entries' function which is directly
used by sys$get_dir_entries.
This fixes an issue we had in the git port where git would not
recognize untracked files (for example in 'git status').
When git used readdir, the 'd_type' field in the dirent struct contained
bad values (Specifically, it contained the values defiend in
Kernel/FileSystem/ext2_fs.h instead of the ones in LibC/dirent.h).
After this fix, we can create a new git repository with 'git init', and
then stage and commit files as usual.
Unlike DirectoryEntry (which is used when constructing directories),
DirectoryEntryView does not manage storage for file names. Names are
just StringViews.
This is much more suited to the directory traversal API and makes
it easier to implement this in file system classes since they no
longer need to create temporary name copies while traversing.
Certain implementations of Inode::directory_entry_count were calling
functions which returned errors, but had no way of surfacing them.
Switch the return type to KResultOr<size_t> and start observing these
error paths.
FileBackedFileSystem is one that's backed by (mounted from) a file, in other
words one that has a "source" of the mount; that doesn't mean it deals in
blocks. The hierarchy now becomes:
* FS
* ProcFS
* DevPtsFS
* TmpFS
* FileBackedFS
* (future) Plan9FS
* BlockBasedFS
* Ext2FS
These APIs were clearly modeled after Ext2FS internals, and make perfect sense
in Ext2FS context. The new APIs are more generic, and map better to the
semantics exported to the userspace, where inode identifiers only appear in
stat() and readdir() output, but never in any input.
This will also hopefully reduce the potential for races (see commit c44b4d61f3).
Lastly, this makes it way more viable to implement a filesystem that only
synthesizes its inodes lazily when queried, and destroys them when they are no
longer in use. With inode identifiers being used to reference inodes, the only
choice for such a filesystem is to persist any inode it has given out the
identifier for, because it might be queried at any later time. With direct
references to inodes, the filesystem will know when the last reference is
dropped and the inode can be safely destroyed.
Allow file system implementation to return meaningful error codes to
callers of the FileDescription::read_entire_file(). This allows both
Process::sys$readlink() and Process::sys$module_load() to return more
detailed errors to the user.
read_block() and write_block() now accept the count (how many bytes to read
or write) and offset (where in the block to start; defaults to 0). Using these
new APIs, we can avoid doing copies between intermediary buffers in a lot more
cases. Hopefully this improves performance or something.
In contrast to the previous patchset that was reverted, this time we use
a "special" method to access a file with block size of 512 bytes (like
a harddrive essentially).