It's now possible to get purgeable memory by using mmap(MAP_PURGEABLE).
Purgeable memory has a "volatile" flag that can be set using madvise():
- madvise(..., MADV_SET_VOLATILE)
- madvise(..., MADV_SET_NONVOLATILE)
When in the "volatile" state, the kernel may take away the underlying
physical memory pages at any time, without notifying the owner.
This gives you a guilt discount when caching very large things. :^)
Setting a purgeable region to non-volatile will return whether or not
the memory has been taken away by the kernel while being volatile.
Basically, if madvise(..., MADV_SET_NONVOLATILE) returns 1, that means
the memory was purged while volatile, and whatever was in that piece
of memory needs to be reconstructed before use.
Using int was a mistake. This patch changes String, StringImpl,
StringView and StringBuilder to use size_t instead of int for lengths.
Obviously a lot of code needs to change as a result of this.
The main thread of each kernel/user process will take the name of
the process. Extra threads will get a fancy new name
"ProcessName[<tid>]".
Thread backtraces now list the thread name in addtion to tid.
Add the thread name to /proc/all (should it get its own proc
file?).
Add two new syscalls, set_thread_name and get_thread_name.
This patch adds these I/O counters to each thread:
- (Inode) file read bytes
- (Inode) file write bytes
- Unix socket read bytes
- Unix socket write bytes
- IPv4 socket read bytes
- IPv4 socket write bytes
These are then exposed in /proc/all and seen in SystemMonitor.
This defaults to 1500 for all adapters, but LoopbackAdapter increases
it to 65536 on construction.
If an IPv4 packet is larger than the MTU, we'll need to break it into
smaller fragments before transmitting it. This part is a FIXME. :^)
Turns out we can use abi::__cxa_demangle() for this, and all we need to
provide is sprintf(), realloc() and free(), so this patch exposes them.
We now have fully demangled C++ backtraces :^)
Previously it was not possible to see what each thread in a process was
up to, or how much CPU it was consuming. This patch fixes that.
SystemMonitor and "top" now show threads instead of just processes.
"ps" is gonna need some more fixing, but it at least builds for now.
Fixes#66.
This adds a call to set_metadata_dirty(true) to
Ext2FS::write_directory(). This fixes a bug wherein InodeWatchers
weren't alerted on directory updates.
Scheduling priority is now set at the thread level instead of at the
process level.
This is a step towards allowing processes to set different priorities
for threads. There's no userspace API for that yet, since only the main
thread's priority is affected by sched_setparam().
Files opened with O_DIRECT will now bypass the disk cache in read/write
operations (though metadata operations will still hit the disk cache.)
This will allow us to test actual disk performance instead of testing
disk *cache* performance, if that's what we want. :^)
There's room for improvment here, we're very aggressively flushing any
dirty cache entries for the specific block before reading/writing that
block. This is done by walking the entire cache, which may be slow.
Don't keep Inodes around in memory forever after we've interacted with
them once. This is a slight performance pessimization when accessing
the same file repeatedly, but closing it for a while in between.
Longer term we should find a way to keep a limited number of unused
Inodes cached, whichever ones we think are likely to be used again.
When creating a new directory, we set the initial size to 1 block.
This meant that we were allocating a block up front, but the Inode's
internal block list cache was not populated with this block.
This broke write_bytes() on a new directory, since it assumed that
the block list cache would be up to date if the call to write_bytes()
would not change the directory's size.
This patch fixes the issue in two ways: First, we cache the initial
block list created for new directories.
Second, we now repopulate the block list cache in write_bytes() if it
is empty when we get there. This is basically just a safety fallback
to avoid having this kind of bug in the future.
We can't be calling the virtual FS::flush_writes() in order to flush
the disk cache from within the disk cache, since an FS subclass may
try to do cache stuff in its flush_writes() implementation.
Instead, separate out the implementation of DiskBackedFS's flushing
logic into a flush_writes_impl() and call that from the cache code.
Also cache the block group descriptor table in a KBuffer on file system
initialization, instead of on first access.
This reduces pressure on the kmalloc heap somewhat.
We were writing out the full block list whenever Ext2FSInode::resize()
was called, even if the old and new sizes were identical.
This patch makes it a no-op, which drastically improves "cp" speed
since we now take full advantage of the up-front call to ftruncate().
If there are not enough free blocks in the filesystem to accomodate
growing an Inode, we should fail with ENOSPC before even starting to
allocate blocks.
Add a simple cache to Ext2FS where we keep block bitmaps along with a
dirty bit. This allows us to coalesce bitmap flushes, giving us a nice
~3x improvement in disk_benchmark write speeds.
Store the cached super block as an ext2_super_block member instead of
caching it in a ByteBuffer and using a casting helper everywhere.
This patch also combines reading/writing of the super block into a
single disk device operation (instead of two.)
Add dedicated internal types for Int64 and UnsignedInt64. This makes it
a bit more straightforward to work with 64-bit numbers (instead of just
implicitly storing them as doubles.)
I have no idea why this was here. It makes no sense. If you're trying
to find out if something is a directory, why wouldn't you be allowed to
ask that about a FIFO? :^)
Thanks to Brandon for spotting this!
Also, while we're here, cache the directory state in a bool member so
we don't have to keep fetching inode metadata when checking this
repeatedly. This is important since sys$read() now calls it.
Previously, procfs$pid_fds would return nothing when called
for a process that had either no open files or a non-existent
handle. This could cause problems when a userspace program
expected a valid Json response.
Procfs$pid_fs now returns an empty array in the aforementioned
cases.
This reverts commit 1cca5142af.
This appears to be causing intermittent triple-faults and I don't know
why yet, so I'll just revert it to keep the tree in decent shape.
Background: DoubleBuffer is a handy buffer class in the kernel that
allows you to keep writing to it from the "outside" while the "inside"
reads from it. It's used for things like LocalSocket and PTY's.
Internally, it has a read buffer and a write buffer, but the two will
swap places when the read buffer is exhausted (by reading from it.)
Before this patch, it was internally implemented as two Vector<u8>
that we would swap between when the reader side had exhausted the data
in the read buffer. Now instead we preallocate a large KBuffer (64KB*2)
on DoubleBuffer construction and use that throughout its lifetime.
This removes all the kmalloc heap traffic caused by DoubleBuffers :^)
The Cache entries found in `DiskBackedFileSystem` are now stored in a
`KBuffer` object, instead of relying on `kmalloc_eternal`. The number
of entries was exceeding that of the number of bytes allocated to
`kmalloc_eternal`, which in turn caused `mount()` to fail epically
when called.
This patch adds three separate per-process fault counters:
- Inode faults
An inode fault happens when we've memory-mapped a file from disk
and we end up having to load 1 page (4KB) of the file into memory.
- Zero faults
Memory returned by mmap() is lazily zeroed out. Every time we have
to zero out 1 page, we count a zero fault.
- CoW faults
VM objects can be shared by multiple mappings that make their own
unique copy iff they want to modify it. The typical reason here is
memory shared between a parent and child process.