The System V ABI requires that the stack is 16-byte aligned on function
call. Confusingly, however, they mean that the stack must be aligned
this way **before** the `CALL` instruction is executed. That instruction
pushes the return value onto the stack, so the callee will actually see
the stack pointer as a value `sizeof(FlatPtr)` smaller.
The signal trampoline was written with this in mind, but `setup_stack`
aligned the entire stack, *including the return address* to a 16-byte
boundary. Because of this, the trampoline subtracted too much from the
stack pointer, thus misaligning it.
This was not a problem on i686 because we didn't execute any
instructions from signal handlers that would require memory operands to
be aligned to more than 4 bytes. This is not the case, however, on
x86_64, where SSE instructions are enabled by default and they require
16-byte aligned operands. Running such instructions raised a GP fault,
immediately killing the offending program with a SIGSEGV signal.
This issue caused TestKernelAlarm to fail in LibC when ran locally, and
at one point, the zsh port was affected too.
Fixes#9291
This change adds a static lock hierarchy / ranking to the Kernel with
the goal of reducing / finding deadlocks when running with SMP enabled.
We have seen quite a few lock ordering deadlocks (locks taken in a
different order, on two different code paths). As we properly annotate
locks in the system, then these facilities will find these locking
protocol violations automatically
The `LockRank` enum documents the various locks in the system and their
rank. The implementation guarantees that a thread holding one or more
locks of a lower rank cannot acquire an additional lock with rank that
is greater or equal to any of the currently held locks.
We previously allowed Thread to exist in a state where its m_name was
null, and had to work around that in various places.
This patch removes that possibility and forces those who would create a
thread (or change the name of one) to provide a NonnullOwnPtr<KString>
with the name.
This expands the reach of error propagation greatly throughout the
kernel. Sadly, it also exposes the fact that we're allocating (and
doing other fallible things) in constructors all over the place.
This patch doesn't attempt to address that of course. That's work for
our future selves.
The default template argument is only used in one place, and it
looks like it was probably just an oversight. The rest of the Kernel
code all uses u8 as the type. So lets make that the default and remove
the unused template argument, as there doesn't seem to be a reason to
allow the size to be customizable.
And also try_create<T> => try_make_ref_counted<T>.
A global "create" was a bit much. The new name matches make<T> better,
which we've used for making single-owner objects since forever.
This patch does three things:
- Convert the global thread list from a HashMap to an IntrusiveList
- Combine the thread list and its lock into a SpinLockProtectedValue
- Customize Thread::unref() so it locks the list while unreffing
This closes the same race window for Thread as @sin-ack's recent changes
did for Process.
Note that the HashMap->IntrusiveList conversion means that we lose O(1)
lookups, but the majority of clients of this list are doing traversal,
not lookup. Once we have an intrusive hashing solution, we should port
this to use that, but for now, this gets rid of heap allocations during
a sensitive time.
The LOCK_DEBUG conditional code is pretty ugly for a feature that we
only use rarely. We can remove a significant amount of this code by
utilizing a zero sized fake type when not building in LOCK_DEBUG mode.
This lets us keep the same API, but just let the compiler optimize it
away when don't actually care about the location the caller came from.
Leave interrupts enabled so that we can still process IRQs. Critical
sections should only prevent preemption by another thread.
Co-authored-by: Tom <tomut@yahoo.com>
By making these functions static we close a window where we could get
preempted after calling Processor::current() and move to another
processor.
Co-authored-by: Tom <tomut@yahoo.com>
Taking a reference or a pointer to a value that's not aligned properly
is undefined behavior. While `[[gnu::packed]]` ensures that reads from
and writes to fields of packed structs is a safe operation, the
information about the reduced alignment is lost when creating pointers
to these values.
Weirdly enough, GCC's undefined behavior sanitizer doesn't flag these,
even though the doc of `-Waddress-of-packed-member` says that it usually
leads to UB. In contrast, x86_64 Clang does flag these, which renders
the 64-bit kernel unable to boot.
For now, the `address-of-packed-member` warning will only be enabled in
the kernel, as it is absolutely crucial there because of KUBSAN, but
might get excessively noisy for the userland in the future.
Also note that we can't append to `CMAKE_CXX_FLAGS` like we do for other
flags in the kernel, because flags added via `add_compile_options` come
after these, so the `-Wno-address-of-packed-member` in the root would
cancel it out.
We were allocating thread FPU state separately in order to ensure a
16-byte alignment. There should be no need to do that.
This patch makes it a regular value member of Thread instead, dodging
one heap allocation during thread creation.
GCC and Clang allow us to inject a call to a function named
__sanitizer_cov_trace_pc on every edge. This function has to be defined
by us. By noting down the caller in that function we can trace the code
we have encountered during execution. Such information is used by
coverage guided fuzzers like AFL and LibFuzzer to determine if a new
input resulted in a new code path. This makes fuzzing much more
effective.
Additionally this adds a basic KCOV implementation. KCOV is an API that
allows user space to request the kernel to start collecting coverage
information for a given user space thread. Furthermore KCOV then exposes
the collected program counters to user space via a BlockDevice which can
be mmaped from user space.
This work is required to add effective support for fuzzing SerenityOS to
the Syzkaller syscall fuzzer. :^) :^)
Depending on the values it might be difficult to figure out whether a
value is decimal or hexadecimal. So let's make this more obvious. Also
this allows copying and pasting those numbers into GNOME calculator and
probably also other apps which auto-detect the base.