* Pass the correct source address for copying tine addr_length.
Previously, this was broken when addr_length was non-nullptr.
* Copy min(sizeof(address), address_length) bytes into address,
instead of sizeof(address), which might be larger than the
user buffer.
* Use sockaddr_storage instead of sockaddr_un. In practice they're
both the same size, but this is what sockaddr_storage is for.
With this (in particular, the first fix), `ue /bin/ntpquery`
actually gets past the recvfrom() call :^)
With this, `ue /bin/ntpquery` can be used to test sendto() and
recvfrom() in ue. (It eventually hits an unimplemented FILD_RM64,
but not before doing emulated network i/o and printing response
details.)
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.
So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.
To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.
Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
From a layering perspective, it's maybe a bit surprising that the
X86::SymbolProvider implementation also lives in LibX86, but since
everything depends on LibELF via LibC, and since all current
LibX86-based disassemblers want to use ELFSymbolProvider, it makes
some amount of sense to put it there.
The SI prefixes "k", "M", "G" mean "10^3", "10^6", "10^9".
The IEC prefixes "Ki", "Mi", "Gi" mean "2^10", "2^20", "2^30".
Let's use the correct name, at least in code.
Only changes the name of the constants, no other behavior change.
This is racy in userspace and non-racy in kernelspace so let's keep
it in kernelspace.
The behavior change where CLOEXEC is preserved when dup2() is called
with (old_fd == new_fd) was good though, let's keep that.
This enables a nice warning in case a function becomes dead code. Also, in case
of signal_trampoline_dummy, marking it external (non-static) prevents it from
being 'optimized away', which would lead to surprising and weird linker errors.
The emulator will now register signal handlers for all possible signals
and act as a translation layer between the kernel and the emulated
process.
To get an accurate simulation of signal handling, we duplicate the same
trampoline mechanism used by the kernel's signal delivery system, and
also use the "sigreturn" syscall to return from a signal handler.
Signal masking is not fully implemented yet, but this is pretty cool!
This virtual syscall works by exec'ing the UserspaceEmulator itself,
with the emulated program's provided arguments as the arguments to the
new UserspaceEmulator instance.
This means that we "follow" exec'ed programs and emulate them as well.
In the future we might want to make this an opt-in (or opt-out, idk)
behavior, but for now it's what we do.
This is really quite cool, I think! :^)
Note that running a setuid program (e.g /bin/ping) in UE does not
actually run uid=0. You'll have to run UE itself as uid=0 if you want
to test programs that do setuid/setgid.
Instead of using SoftCPU::eip() which points at the *next* instruction
most of the time, stash away a "base EIP" so we can use it when making
backtraces. This makes the correct line number show up! :^)
This patch introduces the concept of shadow bits. For every byte of
memory there is a corresponding shadow byte that contains metadata
about that memory.
Initially, the only metadata is whether the byte has been initialized
or not. That's represented by the least significant shadow bit.
Shadow bits travel together with regular values throughout the entire
CPU and MMU emulation. There are two main helper classes to facilitate
this: ValueWithShadow and ValueAndShadowReference.
ValueWithShadow<T> is basically a struct { T value; T shadow; } whereas
ValueAndShadowReference<T> is struct { T& value; T& shadow; }.
The latter is used as a wrapper around general-purpose registers, since
they can't use the plain ValueWithShadow memory as we need to be able
to address individual 8-bit and 16-bit subregisters (EAX, AX, AL, AH.)
Whenever a computation is made using uninitialized inputs, the result
is tainted and becomes uninitialized as well. This allows us to track
this state as it propagates throughout memory and registers.
This patch doesn't yet keep track of tainted flags, that will be an
important upcoming improvement to this.
I'm sure I've messed up some things here and there, but it seems to
basically work, so we have a place to start! :^)