We now use minherit(MAP_INHERIT_ZERO) to create a gettid() cache that
is automatically invalidated on fork(). This is needed since the TID
will be different in a forked child, and so we can't have a stale
cached TID lying around.
This is a gigantic speedup for LibJS (and everyone else too) :^)
Add an extra out-parameter to shbuf_get() that receives the size of the
shared buffer. That way we don't need to make a separate syscall to
get the size, which we always did immediately after.
This feels a lot more consistent and Unixy:
create_shared_buffer() => shbuf_create()
share_buffer_with() => shbuf_allow_pid()
share_buffer_globally() => shbuf_allow_all()
get_shared_buffer() => shbuf_get()
release_shared_buffer() => shbuf_release()
seal_shared_buffer() => shbuf_seal()
get_shared_buffer_size() => shbuf_get_size()
Also, "shared_buffer_id" is shortened to "shbuf_id" all around.
This patch adds a crappy pread() just to get "git" working locally.
A proper version would be implemented in the kernel so that we don't
have to mess with the file descriptor's offset at all.
We should only execute the filename verbatim if it contains a slash (/)
character somewhere. Otherwise, we need to look through the entries in
the PATH environment variable.
This fixes an issue where you could easily "override" system programs
by placing them in a directory you control, and then waiting for
someone to come there and run e.g "ls" :^)
Test: LibC/exec-should-not-search-current-directory.cpp
This syscall is a complement to pledge() and adds the same sort of
incremental relinquishing of capabilities for filesystem access.
The first call to unveil() will "drop a veil" on the process, and from
now on, only unveiled parts of the filesystem are visible to it.
Each call to unveil() specifies a path to either a directory or a file
along with permissions for that path. The permissions are a combination
of the following:
- r: Read access (like the "rpath" promise)
- w: Write access (like the "wpath" promise)
- x: Execute access
- c: Create/remove access (like the "cpath" promise)
Attempts to open a path that has not been unveiled with fail with
ENOENT. If the unveiled path lacks sufficient permissions, it will fail
with EACCES.
Like pledge(), subsequent calls to unveil() with the same path can only
remove permissions, not add them.
Once you call unveil(nullptr, nullptr), the veil is locked, and it's no
longer possible to unveil any more paths for the process, ever.
This concept comes from OpenBSD, and their implementation does various
things differently, I'm sure. This is just a first implementation for
SerenityOS, and we'll keep improving on it as we go. :^)
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.
For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.
Going forward, all new source files should include a license header.
Since a chroot is in many ways similar to a separate root mount, we can also
apply mount flags to it as if it was an actual mount. These flags will apply
whenever the chrooted process accesses its root directory, but not when other
processes access this same directory for the outside. Since it's common to
chdir("/") immediately after chrooting (so that files accessed through the
current directory inherit the same mount flags), this effectively allows one to
apply additional limitations to a process confined inside a chroot.
To this effect, sys$chroot() gains a mount_flags argument (exposed as
chroot_with_mount_flags() in userspace) which can be set to all the same values
as the flags argument for sys$mount(), and additionally to -1 to keep the flags
set for that file system. Note that passing 0 as mount_flags will unset any
flags that may have been set for the file system, not keep them.
This patch implements basic support for OpenBSD-style pledge().
pledge() allows programs to incrementally reduce their set of allowed
syscalls, which are divided into categories that each make up a subset
of POSIX functionality.
If a process violates one of its pledged promises by attempting to call
a syscall that it previously said it wouldn't call, the process is
immediately terminated with an uncatchable SIGABRT.
This is by no means complete, and we'll need to add more checks in
various places to ensure that promises are being kept.
But it is pretty cool! :^)
At the moment, the actual flags are ignored, but we correctly propagate them all
the way from the original mount() syscall to each custody that resides on the
mounted FS.
The chroot() syscall now allows the superuser to isolate a process into
a specific subtree of the filesystem. This is not strictly permanent,
as it is also possible for a superuser to break *out* of a chroot, but
it is a useful mechanism for isolating unprivileged processes.
The VFS now uses the current process's root_directory() as the root for
path resolution purposes. The root directory is stored as an uncached
Custody in the Process object.
Note that I'm developing some helper types in the Syscall namespace as
I go here. Once I settle on some nice types, I will convert all the
other syscalls to use them as well.
The userspace execve() wrapper now measures all the strings and puts
them in a neat and tidy structure on the stack.
This way we know exactly how much to copy in the kernel, and we don't
have to use the SMAP-violating validate_read_str(). :^)
If we pass a null path to these syscall wrappers, just return EFAULT
directly from the wrapper instead of segfaulting by calling strlen().
This is a compromise, since we now have to pass the path length to the
kernel, so we can't rely on the kernel to tell us that the path is at
a bad memory address.
This can be implemented entirely in userspace by calling tcgetattr().
To avoid screwing up the syscall indexes, this patch also adds a
mechanism for removing a syscall without shifting the index of other
syscalls.
Note that ports will still have to be rebuilt after this change,
as their LibC code will try to make the isatty() syscall on startup.
Have pthread_create() allocate a stack and passing it to the kernel
instead of this work happening in the kernel. The more of this we can
do in userspace, the better.
This patch also unexposes the raw create_thread() and exit_thread()
syscalls since they are now only used by LibPthread anyway.