This actually allows us to re-introduce the ldd utility as a symlink to
our dynamic loader, so now ldd behaves exactly like on Linux - it will
load all dynamic dependencies for an ELF exectuable.
This has the advantage that running ldd on an ELF executable will
provide an exact preview of how the order in which the dynamic loader
loads the executable and its dependencies.
We add a prctl option which would be called once after the dynamic
loader has finished to do text relocations before calling the actual
program entry point.
This change makes it much more obvious when we are allowed to change
a region protection access from being writable to executable.
The dynamic loader should be able to do this, but after a certain point
it is obvious that such mechanism should be disabled.
This flag is set only once, and should never reset once it has been set,
making it an ideal SetOnce use-case.
It also simplifies the expected conditions for the enabling prctl call,
as we don't expect a boolean flag, but rather the specific prctl option
will always set (enable) Process' AddressSpace syscall region enforcing.
In particular, define a static LibC library *in LibC's CMakeLists* and
use it in DynamicLoader. This is similar to the way LibELF is included
in DynamicLoader.
Additionally, compile DynamicLoader with -ffunction-sections,
-fdata-sections, and -Wl,--gc-sections. This brings the loader size from
~2Mb to ~1Mb with debug symbols and from ~500Kb to ~150Kb without. Also,
this makes linking DynamicLoader with LibTimeZone unnecessary.
The following command was used to clang-format these files:
clang-format-18 -i $(find . \
-not \( -path "./\.*" -prune \) \
-not \( -path "./Base/*" -prune \) \
-not \( -path "./Build/*" -prune \) \
-not \( -path "./Toolchain/*" -prune \) \
-not \( -path "./Ports/*" -prune \) \
-type f -name "*.cpp" -o -name "*.mm" -o -name "*.h")
There are a couple of weird cases where clang-format now thinks that a
pointer access in an initializer list, e.g. `m_member(ptr->foo)`, is a
lambda return statement, and it puts spaces around the `->`.
Weak symbols can be left unresolved, and code is supposed to check if
they're null at runtime before using them. Handle them appropriately
instead of refusing to load.
The strategy of recreating an object for the second time after
DynamicLoader::map is interesting, of course, but it doesn't help
comprehensibility of the code, unfortunately.
Previously, the actual behavior of magic lookup and one described in its
commit description have not matched. Instead of being weak definitions
in a library that is always in the end of load order, the definitions
were normal ones and thus were able to override other weak definitions
in LibC. While this was consistent with how DynamicLoader resolves
ambiguity between normal and weak relocations, this is not the behavior
POSIX mandates -- we should always choose first available definition wrt
load order. To fix this problem, the patch makes sure we don't define
any of magic symbols in LibC.
In addition to this, it makes all provided magic symbols functions
(instead of objects), what renders MagicWeakSymbol class unnecessary.
We currently only supported Variant II which is used by x86-64.
Variant I is used by both AArch64 (when using the traditional
non-TLSDESC model) and RISC-V, although with small differences.
The TLS layout for Variant I is essentially flipped. The static TLS
blocks are after the thread pointer for Variant I, while on Variant II
they are before it.
Some code using ELF TLS already worked on AArch64 and RISC-V even though
we only support Variant II. This is because only the local-exec model
directly uses TLS offsets, other models use relocations or
__tls_get_addr().
This removes the allocate_tls syscall and adds an archctl option to set
the fs_base for the current thread on x86-64, since you can't set that
register from userspace. enter_thread_context loads the fs_base for the
next thread on each context switch.
This also moves tpidr_el0 (the thread pointer register on AArch64) to
the register state, so it gets properly saved/restored on context
switches.
The userspace TLS allocation code is kept pretty similar to the original
kernel TLS code, aside from a couple of style changes.
We also have to add a new argument "tls_pointer" to
SC_create_thread_params, as we otherwise can't prevent race conditions
between setting the thread pointer register and signal handling code
that might be triggered before the thread pointer was set, which could
use TLS.
These changes are compatible with clang-format 16 and will be mandatory
when we eventually bump clang-format version. So, since there are no
real downsides, let's commit them now.
We currently expect that the relocation type numbers are unique across
all architectures. But RISC-V and x86_64 use the same numbers for
different relocation types (R_X86_64_COPY = R_RISCV_JUMP_SLOT = 5).
So create a generic reloc type enum which maps to the arch-specific
reloc types instead of checking for all arch reloc types individually
everywhere.
This works by defining a set of weak symbols in dynamic linker whose
value would be provided by it. This has the same effect as preloading
library that magically knows right addresses of functions shared between
dynamic linker and LibC.
We were previously passing the same information by rewriting values
based on hardcoded library name, so the new approach seems a little
nicer to me.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
It's possible for a malloc inside load_program_headers() to steal the
reserved memory space we created for the program headers. Remove the
reservation later in the method.
Introduces new builders, mainly `SectionTable` and `StringTable`, and a
final `build_elf_image` to merge everything into a single memory image.
Each of the builders are fully detached from one another, although
StringTable provides an extra API to remove steps when using it with a
SectionTable.
This automates the part of figuring out and properly writing offsets to
headers, and making sure all required data is properly copied and
referenced in the final image.
The dynamic loader can't reasonably run static-pie ELF. static-pie ELFs
might run executable code that invokes syscalls outside of the defined
syscall memory executable (code) region security measure we implement.
Known examples of static-pie ELF objects are ELF packers, and the actual
system-provided dynamic loader itself.
This commit removes DeprecatedString's "null" state, and replaces all
its users with one of the following:
- A normal, empty DeprecatedString
- Optional<DeprecatedString>
Note that null states of DeprecatedFlyString/StringView/etc are *not*
affected by this commit. However, DeprecatedString::empty() is now
considered equal to a null StringView.
We currently don't call any DT_FINI_ARRAY functions, so change that.
The call to `_fini` in `exit` is unnecessary, as we now call the
function referenced by DT_FINI in `__call_fini_functions`.
The arguments are passed on registers, so if we pass only 3 defined
arguments then the fourth argument for the prctl syscall could have
garbage value within it.
To avoid possible bugs, always pass 3 arguments to a raw syscall prctl
call in addition to the prctl sub-option (the first argument).
MAP_FILE is not in POSIX, and is simply in most LibCs as a "default"
mode. Our own LibC defines it as 0, meaning "no flags". It is also not
defined in some OS's, such as Haiku. Let's be more portable and not use
the unnecessary flag.
This is a minimal set of changes to allow `serenity.sh build riscv64` to
successfully generate the build environment and start building. This
includes some, but not all, assembly stubs that will be needed later on;
they are currently empty.