Commit graph

788 commits

Author SHA1 Message Date
Liav A
b536547c52 Process: Use monotonic time for timeouts 2020-03-19 15:48:00 +01:00
Liav A
4484513b45 Kernel: Add new syscall to allow changing the system date 2020-03-19 15:48:00 +01:00
Liav A
9db291d885 Kernel: Introduce the new Time management subsystem
This new subsystem includes better abstractions of how time will be
handled in the OS. We take advantage of the existing RTC timer to aid
in keeping time synchronized. This is standing in contrast to how we
handled time-keeping in the kernel, where the PIT was responsible for
that function in addition to update the scheduler about ticks.
With that new advantage, we can easily change the ticking dynamically
and still keep the time synchronized.

In the process context, we no longer use a fixed declaration of
TICKS_PER_SECOND, but we call the TimeManagement singleton class to
provide us the right value. This allows us to use dynamic ticking in
the future, a feature known as tickless kernel.

The scheduler no longer does by himself the calculation of real time
(Unix time), and just calls the TimeManagment singleton class to provide
the value.

Also, we can use 2 new boot arguments:
- the "time" boot argument accpets either the value "modern", or
  "legacy". If "modern" is specified, the time management subsystem will
  try to setup HPET. Otherwise, for "legacy" value, the time subsystem
  will revert to use the PIT & RTC, leaving HPET disabled.
  If this boot argument is not specified, the default pattern is to try
  to setup HPET.
- the "hpet" boot argumet accepts either the value "periodic" or
  "nonperiodic". If "periodic" is specified, the HPET will scan for
  periodic timers, and will assert if none are found. If only one is
  found, that timer will be assigned for the time-keeping task. If more
  than one is found, both time-keeping task & scheduler-ticking task
  will be assigned to periodic timers.
  If this boot argument is not specified, the default pattern is to try
  to scan for HPET periodic timers. This boot argument has no effect if
  HPET is disabled.

In hardware context, PIT & RealTimeClock classes are merely inheriting
from the HardwareTimer class, and they allow to use the old i8254 (PIT)
and RTC devices, managing them via IO ports. By default, the RTC will be
programmed to a frequency of 1024Hz. The PIT will be programmed to a
frequency close to 1000Hz.

About HPET, depending if we need to scan for periodic timers or not,
we try to set a frequency close to 1000Hz for the time-keeping timer
and scheduler-ticking timer. Also, if possible, we try to enable the
Legacy replacement feature of the HPET. This feature if exists,
instructs the chipset to disconnect both i8254 (PIT) and RTC.
This behavior is observable on QEMU, and was verified against the source
code:
ce967e2f33

The HPETComparator class is inheriting from HardwareTimer class, and is
responsible for an individual HPET comparator, which is essentially a
timer. Therefore, it needs to call the singleton HPET class to perform
HPET-related operations.

The new abstraction of Hardware timers brings an opportunity of more new
features in the foreseeable future. For example, we can change the
callback function of each hardware timer, thus it makes it possible to
swap missions between hardware timers, or to allow to use a hardware
timer for other temporary missions (e.g. calibrating the LAPIC timer,
measuring the CPU frequency, etc).
2020-03-19 15:48:00 +01:00
Alex Muscar
d013753f83
Kernel: Resolve relative paths when there is a veil (#1474) 2020-03-19 09:57:34 +01:00
Andreas Kling
ad92a1e4bc Kernel: Add sys$get_stack_bounds() for finding the stack base & size
This will be useful when implementing conservative garbage collection.
2020-03-16 19:06:33 +01:00
Andreas Kling
3803196edb Kernel: Get rid of SmapDisabler in sys$fstat() 2020-03-10 13:34:24 +01:00
Liav A
0f45a1b5e7 Kernel: Allow to reboot in ACPI via PCI or MMIO access
Also, we determine if ACPI reboot is supported by checking the FADT
flags' field.
2020-03-09 10:53:13 +01:00
Ben Wiederhake
b066586355 Kernel: Fix race in waitid
This is similar to 28e1da344d
and 4dd4dd2f3c.

The crux is that wait verifies that the outvalue (siginfo* infop)
is writable *before* waiting, and writes to it *after* waiting.
In the meantime, a concurrent thread can make the output region
unwritable, e.g. by deallocating it.
2020-03-08 14:12:12 +01:00
Ben Wiederhake
d8cd4e4902 Kernel: Fix race in select
This is similar to 28e1da344d
and 4dd4dd2f3c.

The crux is that select verifies that the filedescriptor sets
are writable *before* blocking, and writes to them *after* blocking.
In the meantime, a concurrent thread can make the output buffer
unwritable, e.g. by deallocating it.
2020-03-08 14:12:12 +01:00
Andreas Kling
b1058b33fb AK: Add global FlatPtr typedef. It's u32 or u64, based on sizeof(void*)
Use this instead of uintptr_t throughout the codebase. This makes it
possible to pass a FlatPtr to something that has u32 and u64 overloads.
2020-03-08 13:06:51 +01:00
Andreas Kling
c6693f9b3a Kernel: Simplify a bunch of dbg() and klog() calls
LogStream can handle VirtualAddress and PhysicalAddress directly.
2020-03-06 15:00:44 +01:00
Liav A
85eb1d26d5 Kernel: Run clang-format on Process.cpp & ACPIDynamicParser.h 2020-03-05 19:04:04 +01:00
Liav A
1b8cd6db7b Kernel: Call ACPI reboot method first if possible
Now we call ACPI reboot method first if possible, and if ACPI reboot is
not available, we attempt to reboot via the keyboard controller.
2020-03-05 19:04:04 +01:00
Ben Wiederhake
4dd4dd2f3c Kernel: Fix race in clock_nanosleep
This is a complete fix of clock_nanosleep, because the thread holds the
process lock again when returning from sleep()/sleep_until().
Therefore, no further concurrent invalidation can occur.
2020-03-03 20:13:32 +01:00
Liav A
0fc60e41dd Kernel: Use klog() instead of kprintf()
Also, duplicate data in dbg() and klog() calls were removed.
In addition, leakage of virtual address to kernel log is prevented.
This is done by replacing kprintf() calls to dbg() calls with the
leaked data instead.
Also, other kprintf() calls were replaced with klog().
2020-03-02 22:23:39 +01:00
Andreas Kling
47beab926d Kernel: Remove ability to create kernel-only regions at user addresses
This was only used by the mechanism for mapping executables into each
process's own address space. Now that we remap executables on demand
when needed for symbolication, this can go away.
2020-03-02 11:20:34 +01:00
Andreas Kling
e56f8706ce Kernel: Map executables at a kernel address during ELF load
This is both simpler and more robust than mapping them in the process
address space.
2020-03-02 11:20:34 +01:00
Andreas Kling
678c87087d Kernel: Load executables on demand when symbolicating
Previously we would map the entire executable of a program in its own
address space (but make it unavailable to userspace code.)

This patch removes that and changes the symbolication code to remap
the executable on demand (and into the kernel's own address space
instead of the process address space.)

This opens up a couple of further simplifications that will follow.
2020-03-02 11:20:34 +01:00
Andreas Kling
0acac186fb Kernel: Make the "entire executable" region shared
This makes Region::clone() do the right thing with it on fork().
2020-03-02 06:13:29 +01:00
Andreas Kling
5c2a296a49 Kernel: Mark read-only PT_LOAD mappings as shared regions
This makes Region::clone() do the right thing for these now that we
differentiate based on Region::is_shared().
2020-03-01 21:26:36 +01:00
Andreas Kling
ecfde5997b Kernel: Use SharedInodeVMObject for executables after all
I had the wrong idea about this. Thanks to Sergey for pointing it out!

Here's what he says (reproduced for posterity):

> Private mappings protect the underlying file from the changes made by
> you, not the other way around. To quote POSIX, "If MAP_PRIVATE is
> specified, modifications to the mapped data by the calling process
> shall be visible only to the calling process and shall not change the
> underlying object. It is unspecified whether modifications to the
> underlying object done after the MAP_PRIVATE mapping is established
> are visible through the MAP_PRIVATE mapping." In practice that means
> that the pages that were already paged in don't get updated when the
> underlying file changes, and the pages that weren't paged in yet will
> load the latest data at that moment.
> The only thing MAP_FILE | MAP_PRIVATE is really useful for is mapping
> a library and performing relocations; it's definitely useless (and
> actively harmful for the system memory usage) if you only read from
> the file.

This effectively reverts e2697c2ddd.
2020-03-01 21:16:27 +01:00
Andreas Kling
bb7dd63f74 Kernel: Run clang-format on Process.cpp 2020-03-01 21:16:27 +01:00
Andreas Kling
687b52ceb5 Kernel: Name perfcore files "perfcore.PID"
This way we can trace many things and we get one perfcore file per
process instead of everyone trying to write to "perfcore"
2020-03-01 20:59:02 +01:00
Andreas Kling
fee20bd8de Kernel: Remove some more harmless InodeVMObject miscasts 2020-03-01 12:27:03 +01:00
Andreas Kling
95e3aec719 Kernel: Fix harmless type miscast in Process::amount_clean_inode() 2020-03-01 11:23:23 +01:00
Andreas Kling
e2697c2ddd Kernel: Use PrivateInodeVMObject for loading program executables
This will be a memory usage pessimization until we actually implement
CoW sharing of the memory pages with SharedInodeVMObject.

However, it's a huge architectural improvement, so let's take it and
improve on this incrementally.

fork() should still be neutral, since all private mappings are CoW'ed.
2020-03-01 11:23:10 +01:00
Andreas Kling
88b334135b Kernel: Remove some Region construction helpers
It's now up to the caller to provide a VMObject when constructing a new
Region object. This will make it easier to handle things going wrong,
like allocation failures, etc.
2020-03-01 11:23:10 +01:00
Andreas Kling
4badef8137 Kernel: Return bytes written if sys$write() fails after writing some
If we wrote anything we should just inform userspace that we did,
and not worry about the error code. Userspace can call us again if
it wants, and we'll give them the error then.
2020-02-29 18:42:35 +01:00
Andreas Kling
7cd1bdfd81 Kernel: Simplify some dbg() logging
We don't have to log the process name/PID/TID, dbg() automatically adds
that as a prefix to every line.

Also we don't have to do .characters() on Strings passed to dbg() :^)
2020-02-29 13:39:06 +01:00
Andreas Kling
8fbdda5a2d Kernel: Implement basic support for sys$mmap() with MAP_PRIVATE
You can now mmap a file as private and writable, and the changes you
make will only be visible to you.

This works because internally a MAP_PRIVATE region is backed by a
unique PrivateInodeVMObject instead of using the globally shared
SharedInodeVMObject like we always did before. :^)

Fixes #1045.
2020-02-28 23:25:00 +01:00
Andreas Kling
aa1e209845 Kernel: Remove some unnecessary indirection in InodeFile::mmap()
InodeFile now directly calls Process::allocate_region_with_vmobject()
instead of taking an awkward detour via a special Region constructor.
2020-02-28 20:29:14 +01:00
Andreas Kling
651417a085 Kernel: Split InodeVMObject into two subclasses
We now have PrivateInodeVMObject and SharedInodeVMObject, corresponding
to MAP_PRIVATE and MAP_SHARED respectively.

Note that PrivateInodeVMObject is not used yet.
2020-02-28 20:20:35 +01:00
Andreas Kling
07a26aece3 Kernel: Rename InodeVMObject => SharedInodeVMObject 2020-02-28 20:07:51 +01:00
Andreas Kling
5af95139fa Kernel: Make Process::m_master_tls_region a WeakPtr
Let's not keep raw Region* variables around like that when it's so easy
to avoid it.
2020-02-28 14:05:30 +01:00
Andreas Kling
b0623a0c58 Kernel: Remove SmapDisabler in sys$connect() 2020-02-28 13:20:26 +01:00
Andreas Kling
dcd619bd46 Kernel: Merge the shbuf_get_size() syscall into shbuf_get()
Add an extra out-parameter to shbuf_get() that receives the size of the
shared buffer. That way we don't need to make a separate syscall to
get the size, which we always did immediately after.
2020-02-28 12:55:58 +01:00
Andreas Kling
f72e5bbb17 Kernel+LibC: Rename shared buffer syscalls to use a prefix
This feels a lot more consistent and Unixy:

    create_shared_buffer()   => shbuf_create()
    share_buffer_with()      => shbuf_allow_pid()
    share_buffer_globally()  => shbuf_allow_all()
    get_shared_buffer()      => shbuf_get()
    release_shared_buffer()  => shbuf_release()
    seal_shared_buffer()     => shbuf_seal()
    get_shared_buffer_size() => shbuf_get_size()

Also, "shared_buffer_id" is shortened to "shbuf_id" all around.
2020-02-28 12:55:58 +01:00
Liav A
db23703570 Process: Use dbg() instead of dbgprintf()
Also, fix a bad derefernce in sys$create_shared_buffer() method.
2020-02-27 13:05:12 +01:00
Andreas Kling
4997dcde06 Kernel: Always disable interrupts in do_killpg()
Will caught an assertion when running "kill 9999999999999" :^)
2020-02-27 11:05:16 +01:00
Andreas Kling
4a293e8a21 Kernel: Ignore signals sent to threadless (zombie) processes
If a process doesn't have any threads left, it's in a zombie state and
we can't meaningfully send signals to it. So just ignore them.

Fixes #1313.
2020-02-27 11:04:15 +01:00
Andreas Kling
0c1497846e Kernel: Don't allow profiling a dead process
Work towards #1313.
2020-02-27 10:42:31 +01:00
Cristian-Bogdan SIRB
05ce8586ea Kernel: Fix ASSERTION failed in join_thread syscall
set_interrupted_by_death was never called whenever a thread that had
a joiner died, so the joiner remained with the joinee pointer there,
resulting in an assertion fail in JoinBlocker: m_joinee pointed to
a freed task, filled with garbage.

Thread::current->m_joinee may not be valid after the unblock

Properly return the joinee exit value to the joiner thread.
2020-02-27 10:09:44 +01:00
Andreas Kling
d28fa89346 Kernel: Don't assert on sys$kill() with pid=INT32_MIN
On 32-bit platforms, INT32_MIN == -INT32_MIN, so we can't expect this
to always work:

    if (pid < 0)
        positive_pid = -pid; // may still be negative!

This happens because the -INT32_MIN expression becomes a long and is
then truncated back to an int.

Fixes #1312.
2020-02-27 10:02:04 +01:00
Cristian-Bogdan SIRB
717cd5015e Kernel: Allow process with multiple threads to call exec and exit
This allows a process wich has more than 1 thread to call exec, even
from a thread. This kills all the other threads, but it won't wait for
them to finish, just makes sure that they are not in a running/runable
state.

In the case where a thread does exec, the new program PID will be the
thread TID, to keep the PID == TID in the new process.

This introduces a new function inside the Process class,
kill_threads_except_self which is called on exit() too (exit with
multiple threads wasn't properly working either).

Inside the Lock class, there is the need for a new function,
clear_waiters, which removes all the waiters from the
Process::big_lock. This is needed since after a exit/exec, there should
be no other threads waiting for this lock, the threads should be simply
killed. Only queued threads should wait for this lock at this point,
since blocked threads are handled in set_should_die.
2020-02-26 13:06:40 +01:00
Andreas Kling
ceec1a7d38 AK: Make Vector use size_t for its size and capacity 2020-02-25 14:52:35 +01:00
Andreas Kling
d0f5b43c2e Kernel: Use Vector::unstable_remove() when deallocating a region
Process::m_regions is not sorted, so we can use unstable_remove()
to avoid shifting the vector contents. :^)
2020-02-24 18:34:49 +01:00
Andreas Kling
30a8991dbf Kernel: Make Region weakable and use WeakPtr<Region> instead of Region*
This turns use-after-free bugs into null pointer dereferences instead.
2020-02-24 13:32:45 +01:00
Andreas Kling
79576f9280 Kernel: Clear the region lookup cache on exec()
Each process has a 1-level lookup cache for fast repeated lookups of
the same VM region (which tends to be the majority of lookups.)
The cache is used by the following syscalls: munmap, madvise, mprotect
and set_mmap_name.

After a succesful exec(), there could be a stale Region* in the lookup
cache, and the new executable was able to manipulate it using a number
of use-after-free code paths.
2020-02-24 12:37:27 +01:00
Liav A
895e874eb4 Kernel: Include the new PIT class in system components 2020-02-24 11:27:03 +01:00
Andreas Kling
fc5ebe2a50 Kernel: Disown shared buffers on sys$execve()
When committing to a new executable, disown any shared buffers that the
process was previously co-owning.

Otherwise accessing the same shared buffer ID from the new program
would cause the kernel to find a cached (and stale!) reference to the
previous program's VM region corresponding to that shared buffer,
leading to a Region* use-after-free.

Fixes #1270.
2020-02-22 12:29:38 +01:00