Commit graph

1312 commits

Author SHA1 Message Date
Andreas Kling
ddaedbca87 Kernel: Allow sys$rename() to rename symlinks
Previously, this syscall would try to rename the target of the link,
not the link itself.
2020-12-27 15:38:07 +01:00
Andreas Kling
0e2b7f9c9a Kernel: Remove the per-process icon_id and sys$set_process_icon()
This was a goofy kernel API where you could assign an icon_id (int) to
a process which referred to a global shbuf with a 16x16 icon bitmap
inside it.

Instead of this, programs that want to display a process icon now
retrieve it from the process executable instead.
2020-12-27 01:16:56 +01:00
Andreas Kling
21ccbc2167 Kernel: Expose process executable paths in /proc/all 2020-12-27 01:16:56 +01:00
AnotherTest
a9184fcb76 Kernel: Implement unveil() as a prefix-tree
Fixes #4530.
2020-12-26 11:54:54 +01:00
Andreas Kling
82f86e35d6 Kernel+LibC: Introduce a "dumpable" flag for processes
This new flag controls two things:
- Whether the kernel will generate core dumps for the process
- Whether the EUID:EGID should own the process's files in /proc

Processes are automatically made non-dumpable when their EUID or EGID is
changed, either via syscalls that specifically modify those ID's, or via
sys$execve(), when a set-uid or set-gid program is executed.

A process can change its own dumpable flag at any time by calling the
new sys$prctl(PR_SET_DUMPABLE) syscall.

Fixes #4504.
2020-12-25 19:35:55 +01:00
Andreas Kling
3c9bd911b8 Kernel: Make /proc/PID directories owned by the EUID:EGID
This is instead of the UID:GID, since that was allowing some very bad
information leaks like spawning "su" as an unprivileged user and having
full /proc access to it.

Work towards #4504.
2020-12-25 19:35:55 +01:00
Brendan Coles
b156c5a8eb ProcFS: pid_vm: Replace duplicated purgeable key with kernel+cacheable
ProcFS /proc/<pid>/vm map info no longer contains two `purgeable` keys.

The second `purgeable` key has been removed and replaced with keys for
`kernel` and `cacheable`.
2020-12-24 10:26:39 +01:00
Andreas Kling
51713901b1 Kernel: Tweak parameter name in Inode::read_entire()
This is a descriptION, not a descriptOR. :^)
2020-12-23 20:36:14 +01:00
Andreas Kling
b452dd13b6 Kernel: Allow sys$chmod() to modify the set-gid bit
We were incorrectly masking off the set-gid bit.

Fixes #4060.
2020-12-22 17:48:42 +01:00
Tom
5f51d85184 Kernel: Improve time keeping and dramatically reduce interrupt load
This implements a number of changes related to time:
* If a HPET is present, it is now used only as a system timer, unless
  the Local APIC timer is used (in which case the HPET timer will not
  trigger any interrupts at all).
* If a HPET is present, the current time can now be as accurate as the
  chip can be, independently from the system timer. We now query the
  HPET main counter for the current time in CPU #0's system timer
  interrupt, and use that as a base line. If a high precision time is
  queried, that base line is used in combination with quering the HPET
  timer directly, which should give a much more accurate time stamp at
  the expense of more overhead. For faster time stamps, the more coarse
  value based on the last interrupt will be returned. This also means
  that any missed interrupts should not cause the time to drift.
* The default system interrupt rate is reduced to about 250 per second.
* Fix calculation of Thread CPU usage by using the amount of ticks they
  used rather than the number of times a context switch happened.
* Implement CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE and use it
  for most cases where precise timestamps are not needed.
2020-12-21 18:26:12 +01:00
Lenny Maiorani
765936ebae
Everywhere: Switch from (void) to [[maybe_unused]] (#4473)
Problem:
- `(void)` simply casts the expression to void. This is understood to
  indicate that it is ignored, but this is really a compiler trick to
  get the compiler to not generate a warning.

Solution:
- Use the `[[maybe_unused]]` attribute to indicate the value is unused.

Note:
- Functions taking a `(void)` argument list have also been changed to
  `()` because this is not needed and shows up in the same grep
  command.
2020-12-21 00:09:48 +01:00
Andreas Kling
8e79bde2b7 Kernel: Move KBufferBuilder to the fallible KBuffer API
KBufferBuilder::build() now returns an OwnPtr<KBuffer> and can fail.
Clients of the API have been updated to handle that situation.
2020-12-18 19:22:26 +01:00
Andreas Kling
bcd2844439 TmpFS: Use fallible KBuffer API
If allocation fails, some TmpFS operations can now fail with ENOMEM.
2020-12-18 19:22:26 +01:00
Andreas Kling
47da86d136 Ext2FS: Fail the mount if BGD table cache allocation fails
Instead of asserting if we can't allocate enough memory for a BGD table
cache, just fail the mount instead.
2020-12-18 19:22:26 +01:00
Itamar
345abc3132 Kernel: Move InodeWatcher::Event into Kernel/API/InodeWatcherEvent
This allows userspace code to parse these events.
2020-12-14 23:05:53 +01:00
Itamar
efe4da57df Loader: Stabilize loader & Use shared libraries everywhere :^)
The dynamic loader is now stable enough to be used everywhere in the
system - so this commit does just that.
No More .a Files, Long Live .so's!
2020-12-14 23:05:53 +01:00
Tom
da5cc34ebb Kernel: Fix some issues related to fixes and block conditions
Fix some problems with join blocks where the joining thread block
condition was added twice, which lead to a crash when trying to
unblock that condition a second time.

Deferred block condition evaluation by File objects were also not
properly keeping the File object alive, which lead to some random
crashes and corruption problems.

Other problems were caused by the fact that the Queued state didn't
handle signals/interruptions consistently. To solve these issues we
remove this state entirely, along with Thread::wait_on and change
the WaitQueue into a BlockCondition instead.

Also, deliver signals even if there isn't going to be a context switch
to another thread.

Fixes #4336 and #4330
2020-12-12 21:28:12 +01:00
Tom
046d6855f5 Kernel: Move block condition evaluation out of the Scheduler
This makes the Scheduler a lot leaner by not having to evaluate
block conditions every time it is invoked. Instead evaluate them as
the states change, and unblock threads at that point.

This also implements some more waitid/waitpid/wait features and
behavior. For example, WUNTRACED and WNOWAIT are now supported. And
wait will now not return EINTR when SIGCHLD is delivered at the
same time.
2020-11-30 13:17:02 +01:00
Tom
6cb640eeba Kernel: Move some time related code from Scheduler into TimeManagement
Use the TimerQueue to expire blocking operations, which is one less thing
the Scheduler needs to check on every iteration.

Also, add a BlockTimeout class that will automatically handle relative or
absolute timeouts as well as overriding timeouts (e.g. socket timeouts)
more consistently.

Also, rework the TimerQueue class to be able to fire events from
any processor, which requires Timer to be RefCounted. Also allow
creating id-less timers for use by blocking operations.
2020-11-30 13:17:02 +01:00
Andreas Kling
76308c2e1f Kernel: Reduce ByteBuffer thrashing in inode block list generation
Instead of creating and destroying a new ByteBuffer for every block we
process during block list generation, just use stack memory instead.
2020-11-24 21:29:08 +01:00
Andreas Kling
5f2f31861c Kernel: Use a doubly-linked list for the BlockBasedFS cache
This makes misses in the BlockBasedFS's LRU block cache faster by
storing the cache entries in one of two doubly-linked list.

Dirty and clean cache entries are kept in two separate lists, and
move between them when their state changes. This can probably be
improved upon further.
2020-11-24 16:42:01 +01:00
Andreas Kling
3e3a72f2a2 Ext2FS: Oops, fix forgotten assignment in Ext2FSInode::resize()
If the inode's block list cache is empty, we forgot to assign the
result of computing the block list. The fact that this worked anyway
makes me wonder when we actually don't have a cache..

Thanks to szyszkienty for spotting this! :^)
2020-11-24 16:16:09 +01:00
Andreas Kling
a6a3c20071 Kernel: Add a fast lookup table to the BlockBasedFS disk cache
Instead of doing a linear scan of the entire cache when doing a lookup,
we now have a nice O(1) HashMap in front of the cache.

The cache miss case can still be improved, this patch really only helps
the cache hit case.

This dramatically improves cached filesystem I/O. :^)
2020-11-24 13:40:54 +01:00
Andreas Kling
20205708b9 Ext2FS: Use cached inode block list in resize() if available
If we have already cached the block list of an Ext2FSInode, we can save
a lot of time by not regenerating it.
2020-11-24 13:40:45 +01:00
Andreas Kling
541579bc04 Kernel: Remove unnecessary SmapDisablers in FileDescription
Since we're using UserOrKernelBuffers, SMAP will be automatically
disabled when we actually access the buffer later on. There's no need
to disable it wholesale across the entire read/write operations.
2020-11-24 11:26:40 +01:00
Sergey Bugaev
098070b767 Kernel: Add unveil('b')
This is a new "browse" permission that lets you open (and subsequently list
contents of) directories underneath the path, but not regular files or any other
types of files.
2020-11-23 18:37:40 +01:00
Andreas Kling
dfce9051fa ProcFS: Take the "all inodes" lock when generating /proc/inodes
Otherwise the kernel asserts.
2020-11-23 16:19:30 +01:00
Andreas Kling
bb9c705fc2 Ext2FS: Move some EXT2_DEBUG logging behind EXT2_VERY_DEBUG
This makes the build actually somewhat usable with EXT2_DEBUG. :^)
2020-11-23 16:08:42 +01:00
Andreas Kling
df758a5a51 Ext2FS: Clear out the direct block list when an inode is resized to 0
e2fsck was complaining about blocks being allocated in an inode's list
of direct blocks while at the same time being free in the block bitmap.

It was easy to reproduce by creating a file with non-zero length and
then truncating it. This fixes the issue by clearing out the direct
block list when resizing a file to 0.
2020-11-23 14:08:50 +01:00
Andreas Kling
abe9cec612 TmpFS: Set the root inode's timestamp to the current time
cc @bcoles :^)
2020-11-14 10:44:47 +01:00
Tom
75f61fe3d9 AK: Make RefPtr, NonnullRefPtr, WeakPtr thread safe
This makes most operations thread safe, especially so that they
can safely be used in the Kernel. This includes obtaining a strong
reference from a weak reference, which now requires an explicit
call to WeakPtr::strong_ref(). Another major change is that
Weakable::make_weak_ref() may require the explicit target type.
Previously we used reinterpret_cast in WeakPtr, assuming that it
can be properly converted. But WeakPtr does not necessarily have
the knowledge to be able to do this. Instead, we now ask the class
itself to deliver a WeakPtr to the type that we want.

Also, WeakLink is no longer specific to a target type. The reason
for this is that we want to be able to safely convert e.g. WeakPtr<T>
to WeakPtr<U>, and before this we just reinterpret_cast the internal
WeakLink<T> to WeakLink<U>, which is a bold assumption that it would
actually produce the correct code. Instead, WeakLink now operates
on just a raw pointer and we only make those constructors/operators
available if we can verify that it can be safely cast.

In order to guarantee thread safety, we now use the least significant
bit in the pointer for locking purposes. This also means that only
properly aligned pointers can be used.
2020-11-10 19:11:52 +01:00
Andreas Kling
1da828b8bf Ext2FS: Zero out inode metadata when deleting them
This isn't strictly necessary but it seems like a reasonable thing
to be doing. Note that we still populate the dtime field with the
time of deletion.
2020-11-07 17:48:22 +01:00
Andreas Kling
bab24ce34c Ext2FS: Deallocate block list meta blocks when freeing an inode
When computing the list of blocks to deallocate when freeing an inode,
we would stop collecting blocks after reaching the inode's block count.
Since we're getting rid of the inode, we need to also include the meta
blocks used by the on-disk block list itself.
2020-11-07 16:45:03 +01:00
Andreas Kling
a28f29c82c Kernel+LibC: Don't allow a directory to become a subdirectory of itself
If you try to do this (e.g "mv directory directory"), sys$rename() will
now fail with EDIRINTOSELF.

Dr. POSIX says we should return EINVAL for this, but a custom error
code allows us to print a much more helpful error message when this
problem occurs. :^)
2020-11-01 19:21:19 +01:00
Andreas Kling
a316ca0e0d TmpFS: Don't allow file names longer than NAME_MAX
Fixes #3636.
2020-10-22 18:59:00 +02:00
Lenny Maiorani
d1fe6a0b53
Everywhere: Redundant inline specifier on constexpr functions (#3807)
Problem:
- `constexpr` functions are decorated with the `inline` specifier
  keyword. This is redundant because `constexpr` functions are
  implicitly `inline`.
- [dcl.constexpr], §7.1.5/2 in the C++11 standard): "constexpr
  functions and constexpr constructors are implicitly inline (7.1.2)".

Solution:
- Remove the redundant `inline` keyword.
2020-10-20 18:08:13 +02:00
Andreas Kling
eeffd5be07 Ext2FS: Fix block allocation ignoring the very last block group
The block group indices are 1-based for some reason. Because of that,
we were forgetting to check in the very last block group when doing
block allocation. This caused block allocation to fail even when the
superblock indicated that we had free blocks.

Fixes #3674.
2020-10-07 13:42:17 +02:00
Linus Groh
bcfc6f0c57 Everywhere: Fix more typos 2020-10-03 12:36:49 +02:00
Luke
d79194d87f Kernel: Return early in create_inode if name is too long 2020-09-28 21:52:31 +02:00
Ben Wiederhake
64cc3f51d0 Meta+Kernel: Make clang-format-10 clean 2020-09-25 21:18:17 +02:00
Andreas Kling
2cb32f8356 Kernel: Let InodeWatcher track child inode numbers instead of names
First of all, this fixes a dumb info leak where we'd write kernel heap
addresses (StringImpl*) into userspace memory when reading a watcher.

Instead of trying to pass names to userspace, we now simply pass the
child inode index. Nothing in userspace makes use of this yet anyway,
so it's not like we're breaking anything. We'll see how this evolves.
2020-09-19 16:39:52 +02:00
Andreas Kling
55dd13ccac Kernel: Don't assert when reading too little from an InodeWatcher
If you provide a buffer that's too small, we'll still dequeue an event
and write whatever fits in the provided buffer.
2020-09-19 15:39:53 +02:00
Tom
ba238ac62a Kernel: Simplify ProcFS callbacks by using function pointers directly 2020-09-19 01:22:30 +02:00
asynts
0579a2db34 Kernel: Fix kernel crash in get_dir_entries when buffer too small.
Before e06362de9487806df92cf2360a42d3eed905b6bf this was a sneaky buffer
overflow. BufferStream did not do range checking and continued to write
past the allocated buffer (the size of which was controlled by the
user.)

The issue surfaced after my changes because OutputMemoryStream does
range checking.

Not sure how exploitable that bug was, directory entries are somewhat
controllable by the user but the buffer was on the heap, so exploiting
that should be tough.
2020-09-16 17:10:04 +02:00
asynts
206dcd84a6 FileSystem: Use OutputMemoryStream instead of BufferStream. 2020-09-15 20:36:45 +02:00
Tom
c8d9f1b9c9 Kernel: Make copy_to/from_user safe and remove unnecessary checks
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.

So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.

To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.

Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
2020-09-13 21:19:15 +02:00
Ben Wiederhake
0d79e57c4d Kernel: Fix various forward declarations
I decided to modify MappedROM.h because all other entried in Forward.h
are also classes, and this is visually more pleasing.

Other than that, it just doesn't make any difference which way we resolve
the conflicts.
2020-09-12 13:46:15 +02:00
Tom
0fab0ee96a Kernel: Rename Process::is_ring0/3 to Process::is_kernel/user_process
Since "rings" typically refer to code execution and user processes
can also execute in ring 0, rename these functions to more accurately
describe what they mean: kernel processes and user processes.
2020-09-10 19:57:15 +02:00
asynts
ec1080b18a Refactor: Replace usages of FixedArray with Vector. 2020-09-08 14:01:21 +02:00
Andreas Kling
4527d9852a Kernel: Track time-of-last-write in SlavePTY and report it as mtime 2020-09-06 18:48:24 +02:00