We now use the regular "user" physical pages for on-demand page table
allocations. This was by far the biggest source of super physical page
exhaustion, so that bug should be a thing of the past now. :^)
We still have super pages, but they are barely used. They remain useful
for code that requires memory with a low physical address.
Fixes #1000.
After MemoryManager initialization, we now only leave the lowest 1MB
of memory identity-mapped. The very first (null) page is not present.
All other pages are RW but not X. Supervisor only.
The kernel and its static data structures are no longer identity-mapped
in the bottom 8MB of the address space, but instead move above 3GB.
The first 8MB above 3GB are pseudo-identity-mapped to the bottom 8MB of
the physical address space. But things don't have to stay this way!
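As a rough sketch of what the low-memory setup amounts to (ensure_pte() and
the PageTableEntry accessors below are illustrative helpers, not the actual
MemoryManager code):
    // Sketch: identity-map the lowest 1MB, leaving page 0 unmapped so that
    // null dereferences fault. All helper names here are assumptions.
    for (size_t addr = PAGE_SIZE; addr < 0x100000 /* 1MB */; addr += PAGE_SIZE) {
        auto& pte = ensure_pte(kernel_page_directory(), VirtualAddress(addr));
        pte.set_physical_page_base(addr); // identity: virtual == physical
        pte.set_present(true);
        pte.set_writable(true);           // RW
        pte.set_user_allowed(false);      // supervisor only
        pte.set_execute_disabled(true);   // not X
    }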
Thanks to Jesse who made an earlier attempt at this, it was really easy
to get device drivers working once the page tables were in place! :^)
Fixes #734.
We can now create a cacheable Region: when map() is called on a cacheable
Region, the virtual memory being mapped for it is left with caching enabled
(i.e. the cache-disable bit is not set).
In addition to that, OS components can create a Region that will be
mapped to a specific physical address by using the appropriate helper
method.
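For illustration only, creating such a Region might look like the sketch
below; the helper name and its signature are assumptions, not the real API:
    // Sketch: map a device's MMIO range at a fixed physical address with
    // caching disabled. Helper name/signature are hypothetical.
    auto framebuffer_region = MM.allocate_kernel_region_with_physical_address(
        PhysicalAddress(0xe0000000), framebuffer_size_in_bytes,
        "Framebuffer", Region::Access::Read | Region::Access::Write,
        /* cacheable */ false);
    framebuffer_region->map(kernel_page_directory());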
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that
prevents the kernel from accessing userspace memory. With SMAP enabled,
trying to read/write a userspace memory address while in the kernel
will now generate a page fault.
Since it's sometimes necessary to read/write userspace memory, there
are two new instructions that quickly switch the protection on/off:
STAC (disables protection) and CLAC (enables protection.)
These are exposed in kernel code via the stac() and clac() helpers.
There's also a SmapDisabler RAII object that can be used to ensure
that you don't forget to re-enable protection before returning to
userspace code.
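A minimal sketch of what such a guard boils down to (the real class may also
save and restore the flags register):
    // Sketch of an RAII guard around stac()/clac(); details may differ from
    // the actual SmapDisabler.
    class SmapDisabler {
    public:
        SmapDisabler() { stac(); }   // allow kernel access to user memory
        ~SmapDisabler() { clac(); }  // re-enable SMAP protection on scope exit
    };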
This patch also adds copy_to_user(), copy_from_user() and memset_user()
which are the "correct" way of doing things. These functions allow us
to briefly disable protection for a specific purpose, and then turn it
back on immediately after it's done. Going forward, all kernel code
should be moved to using these, and all uses of SmapDisabler are to be
considered FIXMEs.
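Conceptually, these helpers wrap each access in a short stac()/clac() window,
roughly like this simplified sketch (the validation helper and the exact
signature are assumptions):
    // Rough sketch of the copy_from_user() pattern.
    bool copy_from_user(void* dest, const void* user_src, size_t size)
    {
        if (!validate_user_read(user_src, size)) // hypothetical validation helper
            return false;
        stac();                       // briefly allow access to user memory
        memcpy(dest, user_src, size);
        clac();                       // immediately restore SMAP protection
        return true;
    }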
Note that we're not realizing the full potential of this feature since
I've used SmapDisabler quite liberally in this initial bring-up patch.
We now validate the full range of userspace memory passed into syscalls
instead of just checking that the first and last byte of the memory are
in process-owned regions.
This fixes an issue where it was possible to avoid rejection of invalid
addresses that sat between two valid ones, simply by passing a valid
address and a size large enough to put the end of the range at another
valid address.
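In practice that means walking the range one page at a time instead of only
probing its endpoints; a simplified sketch (region_containing() stands in for
the real region lookup):
    // Every page in [base, base + size) must fall inside a process-owned,
    // readable region, not just the first and last byte of the range.
    bool validate_read_range(const Process& process, uintptr_t base, size_t size)
    {
        if (size == 0)
            return true;
        uintptr_t first_page = base & ~(uintptr_t)(PAGE_SIZE - 1);
        uintptr_t last_page = (base + size - 1) & ~(uintptr_t)(PAGE_SIZE - 1);
        for (uintptr_t page = first_page; page <= last_page; page += PAGE_SIZE) {
            auto* region = process.region_containing(page);
            if (!region || !region->is_readable())
                return false;
        }
        return true;
    }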
I added a little test utility that tries to provoke EFAULT in various
ways to help verify this. I'm sure we can think of more ways to test
this but it's at least a start. :^)
Thanks to mozjag for pointing out that this code was still lacking!
Incidentally this also makes backtraces work again.
Fixes #989.
We now refuse to boot on machines that don't support PAE since all
of our paging code depends on it.
Also, let's only enable SSE and PGE if the CPU advertises support for them.
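Roughly, the checks come down to testing the standard CPUID feature bits in
leaf 1, EDX: PAE is bit 6, PGE is bit 13, and SSE is bit 25. A sketch
(cpuid(), hang() and the enable_*() helpers are illustrative):
    u32 eax, ebx, ecx, edx;
    cpuid(1, eax, ebx, ecx, edx);
    if (!(edx & (1 << 6)))   // no PAE
        hang();              // refuse to boot
    if (edx & (1 << 13))
        enable_pge();        // set CR4.PGE
    if (edx & (1 << 25))
        enable_sse();        // set the CR0/CR4 SSE bits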
At the moment, addresses below 8MB and above 3GB are never accessible
to userspace, so just reject them without even looking at the current
process's memory regions.
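So the fast path is just a bounds check before any region lookup, along the
lines of:
    // Sketch: reject anything outside the window userspace can ever see.
    if (address < 0x800000 /* 8MB */ || address >= 0xc0000000 /* 3GB */)
        return false;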
We were happily allowing syscalls with pointers into kernel-only
regions (virtual address >= 0xc0000000).
This patch fixes that by only considering user regions in the current
process, and also double-checking the Region::is_user_accessible() flag
before approving an access.
Thanks to Fire30 for finding the bug! :^)
Instead of panicking right away when we run out of physical pages,
we now try to find a PurgeableVMObject with some volatile pages in it.
If we find one, we purge that entire object and steal one of its pages.
This makes it possible for the kernel to keep going instead of dying.
Very cool. :^)
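The allocation path now does something along these lines (all names are
approximate):
    // Sketch of the out-of-pages fallback.
    auto page = find_free_user_physical_page();
    if (!page) {
        // Out of pages: purge the first purgeable object that has volatile
        // pages, then retry the allocation using the memory it gave back.
        for (auto& vmobject : all_purgeable_vmobjects()) {
            if (vmobject.purge() > 0) {
                page = find_free_user_physical_page();
                break;
            }
        }
    }
    if (!page)
        PANIC("Out of physical pages"); // only if purging freed nothing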
Previously we assumed all hosts would have support for IA32_EFER.NXE.
This is mostly true for newer hardware, but older hardware will crash
and burn if you try to use this feature.
Now we check for support via CPUID.80000001[20].
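That check reads the extended feature flags, roughly like this (cpuid() and
enable_efer_nxe() are illustrative helpers):
    u32 eax, ebx, ecx, edx;
    cpuid(0x80000000, eax, ebx, ecx, edx);   // max extended leaf
    bool has_nx = false;
    if (eax >= 0x80000001) {
        cpuid(0x80000001, eax, ebx, ecx, edx);
        has_nx = edx & (1 << 20);            // NX feature bit
    }
    if (has_nx)
        enable_efer_nxe();                   // set IA32_EFER.NXE (MSR 0xC0000080, bit 11)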
Introduce one more (CPU) indirection layer in the paging code: the page
directory pointer table (PDPT). Each PageDirectory now has 4 separate
PageDirectoryEntry arrays, governing 1 GB of VM each.
A really neat side-effect of this is that we can now share the physical
page containing the >=3GB kernel-only address space metadata between
all processes, instead of lazily cloning it on page faults.
This will give us access to the NX (No eXecute) bit, allowing us to
prevent execution of memory that's not supposed to be executed.
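Schematically, each PageDirectory now looks something like this simplified
sketch (the real class also manages the backing physical pages):
    // One PDPT with 4 entries, each pointing at a page directory that
    // governs 1 GB of virtual address space.
    struct PageDirectoryPointerTable {
        u64 raw[4]; // physical addresses of the four page directories
    };
    class PageDirectory {
        // Four PageDirectoryEntry arrays, one per 1 GB slot. Slot 3
        // (0xc0000000 and up) can point at the same physical page in every
        // process, since kernel mappings are shared.
        PageDirectoryEntry* m_entries[4];
    };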
I'm not sure how I managed to misread the location of this bit twice.
But I did! Here is finally the correct value, according to Intel:
"Page Global Enable (bit 7 of CR4)"
Jeez! :^)
Setting this bit will cause the CPU to generate a page fault when
writing to read-only memory, even if we're executing in the kernel.
Seemingly the only change needed to make this work was to have the
inode-backed page fault handler use a temporary mapping for writing
the read-from-disk data into the newly-allocated physical page.
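The bit in question is the x86 write-protect bit, CR0.WP (bit 16 of CR0);
setting it is a one-liner, roughly:
    // Set CR0.WP so that ring 0 writes to read-only pages also fault.
    asm volatile(
        "mov %%cr0, %%eax\n"
        "or $0x10000, %%eax\n"   // bit 16 = WP
        "mov %%eax, %%cr0" ::: "eax");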
To enforce this, we create two separate mappings of the same underlying
physical page. A writable mapping for the kernel, and a read-only one
for userspace (the one returned by sys$get_kernel_info_page.)
The fault was happening when retrieving a current backtrace for the
SystemServer process.
To generate a backtrace, we go into the paging scope of the process,
meaning we temporarily switch to using its page directory as our own.
Because kernel VM is allocated on demand, it's possible for a process's
mappings above the 3GB mark to be out-of-date. Normally this just gets
fixed up transparently by the page fault handler (which simply copies
the PDE from the canonical MM.kernel_page_directory() into the current
process.)
However, if the current kernel *stack* is in a piece of memory that
the backtraced process lacks up-to-date PDEs for, we still get a page
fault, but are unable to handle it, since the CPU wants to push to the
stack as part of calling the page fault handler. So we're screwed and
it's a triple-fault.
Fix this by always updating the kernel VM mappings before switching
into a paging scope. In practical terms, this is a 1KB memcpy() that
happens when generating a backtrace, or doing exec().
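In other words, entering a paging scope now starts with something like this
sketch (sizes assume the non-PAE layout, where the kernel half is the top
quarter of a 1024-entry page directory):
    // Bring the target page directory's kernel-half PDEs up to date with the
    // canonical kernel page directory: 256 entries x 4 bytes = 1KB, covering
    // 0xc0000000..0xffffffff.
    void update_kernel_mappings(PageDirectory& pd)
    {
        memcpy(&pd.entries()[768],
               &MM.kernel_page_directory().entries()[768],
               256 * sizeof(PageDirectoryEntry));
    }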
The kernel is now no longer identity mapped to the bottom 8MiB of
memory, and is now mapped at the higher address of `0xc0000000`.
The lower ~1MiB of memory (from GRUB's mmap), however, is still
identity-mapped to provide an easy way for the kernel to get
physical pages for things such as DMA. These could later be
mapped to the higher address too, but I'm not sure how to go
about doing that elegantly without a lot of address subtractions.
VM regions can now be marked as stack regions, which is then validated
on syscall, and on page fault.
If a thread is caught with its stack pointer pointing into anything
that's *not* a Region with its stack bit set, we'll crash the whole
process with SIGSTKFLT.
Userspace must now allocate custom stacks by using mmap() with the new
MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now
we have it too, yay! :^)
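From userspace, allocating a custom stack (e.g. for a new thread) now looks
roughly like:
    #include <stdio.h>
    #include <sys/mman.h>
    // Sketch: allocate a 64 KiB stack. Only MAP_STACK regions may hold the
    // stack pointer; anything else gets the process killed with SIGSTKFLT.
    void* stack = mmap(nullptr, 64 * 1024, PROT_READ | PROT_WRITE,
                       MAP_ANONYMOUS | MAP_PRIVATE | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        perror("mmap");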
Now the userspace page allocator will search through physical regions
and stop the search as soon as it finds an available page.
Also remove an "address of" sign since we don't need it when
counting the size of physical regions.
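Sketch of that search loop (the PhysicalRegion interface shown here is
illustrative):
    // Walk the user physical regions until one hands us a free page.
    for (auto& region : m_user_physical_regions) {
        if (auto page = region.take_free_page())
            return page;   // stop at the first available page
    }
    return nullptr;        // no free pages anywhere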
Move the kernel image to the 1 MB physical mark. This prevents it from
colliding with stuff like the VGA memory. This was causing us to end
up with the BIOS screen contents sneaking into kernel memory sometimes.
This patch also bumps the kmalloc heap size from 1 MB to 3 MB. It's not
the perfect permanent solution (obviously) but it should get the OOM
monkey off our backs for a while.
After the page fault handler has found the region in which the fault
occurred, do the rest of the work in the region itself.
This patch also makes all fault types consistently crash the process
if a new page is needed but we're all out of pages.
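The division of labor now looks roughly like this simplified sketch:
    // MemoryManager finds the region; the region does the actual work.
    PageFaultResponse MemoryManager::handle_page_fault(const PageFault& fault)
    {
        auto* region = region_from_vaddr(fault.vaddr()); // illustrative lookup
        if (!region)
            return PageFaultResponse::ShouldCrash;
        return region->handle_fault(fault); // zero-fill, CoW or inode-backed
    }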
Since the kernel page tables are shared between all processes, there's
no need to (implicitly) flush the TLB for them on every context switch.
Setting the G bit on kernel page tables allows the CPU to keep the
translation caches around.
This patch changes the parameter to Region::map() to be a PageDirectory
since that matches how we think about the memory model:
Regions are views onto VMObjects, and are mapped into PageDirectories.
Each Process has a PageDirectory. The kernel also has a PageDirectory.
Since a Region is merely a "window" onto a VMObject, it can both begin
and end at a distance from the VMObject's boundaries.
Therefore, we should always be computing indices into a VMObject's
physical page array by adding the Region's "first_page_index()".
There was a whole bunch of code that forgot to do that. This fixes
many wrong behaviors for Regions that start part-way into a VMObject.
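Concretely, the correct index computation is the one in this sketch:
    // Translate a page index within the Region into an index into the
    // underlying VMObject's physical page array.
    auto& physical_page_slot(Region& region, size_t page_index_in_region)
    {
        // Wrong: region.vmobject().physical_pages()[page_index_in_region]
        // Right: offset by where the Region's window begins in the VMObject.
        return region.vmobject().physical_pages()[region.first_page_index() + page_index_in_region];
    }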