Commit graph

1350 commits

Author SHA1 Message Date
Ben Wiederhake
143a64f9a2 Kernel: Remove unused includes of Kernel/Debug.h
These instances were detected by searching for files that include
Kernel/Debug.h, but don't match the regex:
\\bdbgln_if\(|_DEBUG\\b
This regex is pessimistic, so there might be more files that don't check
for any real *_DEBUG macro. There seem to be no corner cases anyway.

In theory, one might use LibCPP to detect things like this
automatically, but let's do this one step after another.
2023-01-02 20:27:20 -05:00
kleines Filmröllchen
a6a439243f Kernel: Turn lock ranks into template parameters
This step would ideally not have been necessary (increases amount of
refactoring and templates necessary, which in turn increases build
times), but it gives us a couple of nice properties:
- SpinlockProtected inside Singleton (a very common combination) can now
  obtain any lock rank just via the template parameter. It was not
  previously possible to do this with SingletonInstanceCreator magic.
- SpinlockProtected's lock rank is now mandatory; this is the majority
  of cases and allows us to see where we're still missing proper ranks.
- The type already informs us what lock rank a lock has, which aids code
  readability and (possibly, if gdb cooperates) lock mismatch debugging.
- The rank of a lock can no longer be dynamic, which is not something we
  wanted in the first place (or made use of). Locks randomly changing
  their rank sounds like a disaster waiting to happen.
- In some places, we might be able to statically check that locks are
  taken in the right order (with the right lock rank checking
  implementation) as rank information is fully statically known.

This refactoring even more exposes the fact that Mutex has no lock rank
capabilites, which is not fixed here.
2023-01-02 18:15:27 -05:00
Andreas Kling
16f934474f Kernel+Tests: Allow deleting someone else's file in my sticky directory
This should be allowed according to Dr. POSIX. :^)
2023-01-01 10:09:02 +01:00
Andreas Kling
47b9e8e651 Kernel: Annotate VirtualFileSystem::rmdir() errors with spec comments 2023-01-01 10:09:02 +01:00
Andreas Kling
8619f2c6f3 Kernel+Tests: Remove inaccurate FIXME in sys$rmdir()
We were already handling the rmdir("..") case by refusing to remove
directories that were not empty.

This patch removes a FIXME from January 2019 and adds a test. :^)
2023-01-01 10:09:02 +01:00
Andreas Kling
8d781d0216 Kernel+Tests: Make sys$rmdir() fail with EINVAL if basename is "."
Dr. POSIX says that we should reject attempts to rmdir() the file named
"." so this patch does exactly that. We also add a test.

This solves a FIXME from January 2019. :^)
2023-01-01 10:09:02 +01:00
Liav A
91db482ad3 Kernel: Reorganize Arch/x86 directory to Arch/x86_64 after i686 removal
No functional change.
2022-12-28 11:53:41 +01:00
Liav A
5ff318cf3a Kernel: Remove i686 support 2022-12-28 11:53:41 +01:00
Liav A
2e710de2f4 Kernel/FileSystem: Prevent symlink creation in veiled directory paths
Also, try to resolve the target path and check if it is allowed to be
accessed under the unveil rules.
2022-12-21 09:17:09 +00:00
Freakness109
1f1e58ed75 Kernel/Plan9FS: Propagate errors in Plan9FSMessage::append_data 2022-12-17 09:37:04 +00:00
sin-ack
3275015786 Kernel: Implement flock downgrading
This commit makes it possible for a process to downgrade a file lock it
holds from a write (exclusive) lock to a read (shared) lock. For this,
the process must point to the exact range of the flock, and must be the
owner of the lock.
2022-12-11 19:55:37 -07:00
sin-ack
2a502fe232 Kernel+LibC+LibCore+UserspaceEmulator: Implement faccessat(2)
Co-Authored-By: Daniel Bertalan <dani@danielbertalan.dev>
2022-12-11 19:55:37 -07:00
sin-ack
fa692e13f9 Kernel: Use real UID/GID when checking for file access
This aligns the rest of the system with POSIX, who says that access(2)
must check against the real UID and GID, not effective ones.
2022-12-11 19:55:37 -07:00
sin-ack
3472c84d14 Kernel: Remove InodeMetadata::may_{read,write,execute}(Process const&)
These have no definition and are never used.
2022-12-11 19:55:37 -07:00
sin-ack
d5fbdf1866 Kernel+LibC+LibCore: Implement renameat(2)
Now with the ability to specify different bases for the old and new
paths.
2022-12-11 19:55:37 -07:00
Liav A
aa9fab9c3a Kernel/FileSystem: Convert the mount table from Vector to IntrusiveList
The fact that we used a Vector meant that even if creating a Mount
object succeeded, we were still at a risk that appending to the actual
mounts Vector could fail due to OOM condition. To guard against this,
the mount table is now an IntrusiveList, which always means that when
allocation of a Mount object succeeded, then inserting that object to
the list will succeed, which allows us to fail early in case of OOM
condition.
2022-12-09 23:29:33 -07:00
Liav A
6a555af1f1 Kernel: Add callback on ".." directory entry for a TmpFS root directory 2022-12-09 22:59:08 -07:00
Liav A
69f41eb062 Kernel: Reject create links on paths that were not unveiled as writable
This solves one of the security issues being mentioned in issue #15996.
We simply don't allow creating hardlinks on paths that were not unveiled
as writable to prevent possible bypass on a certain path that was
unveiled as non-writable.
2022-12-03 11:00:34 -07:00
Liav A
0bb7c8f4c4 Kernel+SystemServer: Don't hardcode coredump directory path
Instead, allow userspace to decide on the coredump directory path. By
default, SystemServer sets it to the /tmp/coredump directory, but users
can now change this by writing a new path to the sysfs node at
/sys/kernel/variables/coredump_directory, and also to read this node to
check where coredumps are currently generated at.
2022-12-03 05:56:59 -07:00
Liav A
7dcf8f971b Kernel: Rename SysFSSystemBoolean => SysFSSystemBooleanVariable 2022-12-03 05:56:59 -07:00
Liav A
95d8aa2982 Kernel: Allow read access sparingly to some /sys/kernel directory nodes
Those nodes are not exposing any sensitive information so there's no
harm in exposing them.
2022-12-03 05:47:58 -07:00
Liav A
1ca0ac5207 Kernel: Disallow jailed processes to read files in /sys/kernel directory
By default, disallow reading of values in that directory. Later on, we
will enable sparingly read access to specific files.

The idea that led to this mechanism was suggested by Jean-Baptiste
Boric (also known as boricj in GitHub), to prevent access to sensitive
information in the SysFS if someone adds a new file in the /sys/kernel
directory.
2022-12-03 05:47:58 -07:00
Liav A
2e55956784 Kernel: Forbid access to /sys/kernel/power_state for Jailed processes
There's simply no benefit in allowing sandboxed programs to change the
power state of the machine, so disallow writes to the mentioned node to
prevent malicious programs to request that.
2022-12-03 05:47:58 -07:00
Andreas Kling
4dd148f07c Kernel: Add File::is_regular_file()
This makes it easy and expressive to check if a File is a regular file.
2022-11-29 11:09:19 +01:00
Liav A
718ae68621 Kernel+LibCore+LibC: Implement support for forcing unveil on exec
To accomplish this, we add another VeilState which is called
LockedInherited. The idea is to apply exec unveil data, similar to
execpromises of the pledge syscall, on the current exec'ed program
during the execve sequence. When applying the forced unveil data, the
veil state is set to be locked but the special state of LockedInherited
ensures that if the new program tries to unveil paths, the request will
silently be ignored, so the program will continue running without
receiving an error, but is still can only use the paths that were
unveiled before the exec syscall. This in turn, allows us to use the
unveil syscall with a special utility to sandbox other userland programs
in terms of what is visible to them on the filesystem, and is usable on
both programs that use or don't use the unveil syscall in their code.
2022-11-26 12:42:15 -07:00
sin-ack
3b03077abb Kernel: Update the ".." inode for directories after a rename
Because the ".." entry in a directory is a separate inode, if a
directory is renamed to a new location, then we should update this entry
the point to the new parent directory as well.

Co-authored-by: Liav A <liavalb@gmail.com>
2022-11-25 17:33:05 +01:00
Andreas Kling
a9d55ddf57 Kernel/TmpFS: Update mtime instead of ctime when asked to update mtime 2022-11-24 16:56:27 +01:00
Andreas Kling
10fa72d451 Kernel: Use AK::Time for InodeMetadata timestamps instead of time_t
Before this change, we were truncating the nanosecond part of file
timestamps in many different places.
2022-11-24 16:56:27 +01:00
Andreas Kling
fb00d3ed25 Kernel+lsirq: Track per-CPU IRQ handler call counts
Each GenericInterruptHandler now tracks the number of calls that each
CPU has serviced.

This takes care of a FIXME in the /sys/kernel/interrupts generator.

Also, the lsirq command line tool now displays per-CPU call counts.
2022-11-19 15:39:30 +01:00
Andreas Kling
9b3db63e14 Kernel: Rename GenericInterruptHandler "invoking count" to "call count" 2022-11-19 15:39:30 +01:00
Liav A
31d4c07dee Kernel: Add missing includes for Mount.h file 2022-11-11 10:25:54 +01:00
Liav A
3cc0d60141 Kernel: Split the Ext2FileSystem.{cpp,h} files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
1c91881a1d Kernel: Split the ISO9660FileSystem.{cpp,h} files to smaller components 2022-11-08 02:54:48 -07:00
Liav A
fca3b7f1f9 Kernel: Split the DevPtsFS files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
3fc52a6d1c Kernel: Split the Plan9FileSystem.{cpp,h} file into smaller components 2022-11-08 02:54:48 -07:00
Liav A
3906dd3aa3 Kernel: Split the ProcFS core file into smaller components 2022-11-08 02:54:48 -07:00
Liav A
e882b2ed05 Kernel: Split the FATFileSystem.{cpp,h} files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
5e6101dd3e Kernel: Split the TmpFS core files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
f53149d5f6 Kernel: Split the SysFS core files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
5e062414c1 Kernel: Add support for jails
Our implementation for Jails resembles much of how FreeBSD jails are
working - it's essentially only a matter of using a RefPtr in the
Process class to a Jail object. Then, when we iterate over all processes
in various cases, we could ensure if either the current process is in
jail and therefore should be restricted what is visible in terms of
PID isolation, and also to be able to expose metadata about Jails in
/sys/kernel/jails node (which does not reveal anything to a process
which is in jail).

A lifetime model for the Jail object is currently plain simple - there's
simpy no way to manually delete a Jail object once it was created. Such
feature should be carefully designed to allow safe destruction of a Jail
without the possibility of releasing a process which is in Jail from the
actual jail. Each process which is attached into a Jail cannot leave it
until the end of a Process (i.e. when finalizing a Process). All jails
are kept being referenced in the JailManagement. When a last attached
process is finalized, the Jail is automatically destroyed.
2022-11-05 18:00:58 -06:00
Timon Kruiper
0475407f9f Kernel: Remove bunch of unused includes in SysFS/Processes.cpp 2022-10-26 20:01:45 +02:00
Timon Kruiper
97f1fa7d8f Kernel: Include missing headers for various files
With these missing header files, we can now build these files for
aarch64.
2022-10-26 20:01:45 +02:00
Timon Kruiper
fcbb6b79ac Kernel: Don't expose processor information for aarch64 in sysfs
We do not (yet) acquire this information for the aarch64 processors.
2022-10-26 20:01:45 +02:00
Liav A
75f01692b4 Kernel+Userland: Move /sys/firmware/power_state to /sys/kernel directory
Let's put the power_state global node into the /sys/kernel directory,
because that directory represents all global nodes and variables being
related to the Kernel. It's also a mutable node, that is more acceptable
being in the mentioned directory due to the fact that all other files in
the /sys/firmware directory are just firmware blobs and are not mutable
at all.
2022-10-25 15:33:34 -06:00
Liav A
a91589c09b Kernel: Introduce global variables and stats in /sys/kernel directory
The ProcFS is an utter mess currently, so let's start move things that
are not related to processes-info. To ensure it's done in a sane manner,
we start by duplicating all /proc/ global nodes to the /sys/kernel/
directory, then we will move Userland to use the new directory so the
old directory nodes can be removed from the /proc directory.
2022-10-25 15:33:34 -06:00
Liav A
03ae9f94cf Kernel/FileSystem: Remove hardcoded unveil path of /usr/lib/Loader.so
If a program needs to execute a dynamic executable program, then it
should unveil /usr/lib/Loader.so by itself and not rely on the Kernel to
allow using this binary without any sense of respect to unveil promises
being made by the running parent program.
2022-10-24 19:41:32 -06:00
Gunnar Beutner
ce4b66e908 Kernel: Add support for MSG_NOSIGNAL and properly send SIGPIPE
Previously we didn't send the SIGPIPE signal to processes when
sendto()/sendmsg()/etc. returned EPIPE. And now we do.

This also adds support for MSG_NOSIGNAL to suppress the signal.
2022-10-24 15:49:39 +02:00
Liav A
fea3cb5ff9 Kernel/FileSystem: Discard safely filesystems when unmounted last time
This commit reached that goal of "safely discarding" a filesystem by
doing the following:
1. Stop using the s_file_system_map HashMap as it was an unsafe measure
to access pointers of FileSystems. Instead, make sure to register all
FileSystems at the VFS layer, with an IntrusiveList, to avoid problems
related to OOM conditions.
2. Make sure to cleanly remove the DiskCache object from a BlockBased
filesystem, so the destructor of such object will not need to do that in
the destruction point.
3. For ext2 filesystems, don't cache the root inode at m_inode_cache
HashMap. The reason for this is that when unmounting an ext2 filesystem,
we lookup at the cache to see if there's a reference to a cached inode
and if that's the case, we fail with EBUSY. If we keep the m_root_inode
also being referenced at the m_inode_cache map, we have 2 references to
that object, which will lead to fail with EBUSY. Also, it's much simpler
to always ask for a root inode and get it immediately from m_root_inode,
instead of looking up the cache for that inode.
2022-10-22 16:57:52 -04:00
Liav A
24977996a6 Kernel: Append root filesystem to the VFS FileBackedFileSystem list 2022-10-22 16:57:52 -04:00
Liav A
0fd7b688af Kernel: Introduce support for using FileSystem object in multiple mounts
The idea is to enable mounting FileSystem objects across multiple mounts
in contrast to what happened until now - each mount has its own unique
FileSystem object being attached to it.

Considering a situation of mounting a block device at 2 different mount
points at in system, there were a couple of critical flaws due to how
the previous "design" worked:
1. BlockBasedFileSystem(s) that pointed to the same actual device had a
separate DiskCache object being attached to them. Because both instances
were not synchronized by any means, corruption of the filesystem is most
likely achieveable by a simple cache flush of either of the instances.
2. For superblock-oriented filesystems (such as the ext2 filesystem),
lack of synchronization between both instances can lead to severe
corruption in the superblock, which could render the entire filesystem
unusable.
3. Flags of a specific filesystem implementation (for example, with xfs
on Linux, one can instruct to mount it with the discard option) must be
honored across multiple mounts, to ensure expected behavior against a
particular filesystem.

This patch put the foundations to start fix the issues mentioned above.
However, there are still major issues to solve, so this is only a start.
2022-10-22 16:57:52 -04:00