This patch adds three separate per-process fault counters:
- Inode faults
An inode fault happens when we've memory-mapped a file from disk
and we end up having to load 1 page (4KB) of the file into memory.
- Zero faults
Memory returned by mmap() is lazily zeroed out. Every time we have
to zero out 1 page, we count a zero fault.
- CoW faults
VM objects can be shared by multiple mappings that make their own
unique copy iff they want to modify it. The typical reason here is
memory shared between a parent and child process.
This patch overloads Inode::is_directory() with a faster version that
doesn't require instantiating the whole InodeMetadata.
If you have an Ext2FSInode&, calling is_directory() should be instant
since we can just look directly at the raw inode bits.
If we want to make a new entry in the disk cache when it's completely
full of dirty blocks, we'll now synchronously flush the writes at that
point. Maybe it's not ideal, but at least we can keep going.
This way clients are not required to have instantiated ByteBuffers
and can choose whatever memory scheme works best for them.
Also converted some of the Ext2FS code to use stack buffers instead.
Instead of having DiskBackedFS allocate a ByteBuffer, leave it to each
client to provide buffer space.
This is significantly faster in many cases where we can use a stack
buffer and avoid heap allocation entirely.
The hashmap cache was ridiculously slow and hurt us more than it helped
us. This patch replaces it with a flat memory cache that keeps up to
10'000 blocks in cache with a simple dirty bit.
The syncd task will wake up periodically and call flush_writes() on all
file systems, which now causes us to traverse the cache and write all
dirty blocks to disk.
There's a ton of room for improvement here, but this itself is already
drastically better when doing repeated GCC invocations.
1) Off-by-one in block allocation when block size != 1 KB
Due to a quirk in the on-disk layout of ext2, file systems with a block
size of 1 KB have block #1 as their first block, while all others start
on block #0.
2) We had no fallback mechanism when the preferred group was full
We now allocate blocks from the preferred block group as long as it's
possible, and fall back to a simple scan through all groups when the
preferred one is full.
This is a freelist allocator with static size classes that works as a
complement to the generic kmalloc(). It's a lot faster than kmalloc()
since allocation just means popping from the freelist.
It's also significantly more compact when there are a lot of objects
smaller than the minimum kmalloc chunk size (32 bytes.)
This patch enables it for the Region and PhysicalPage classes.
In the PhysicalPage (8 bytes) case, it's a huge improvement since we
no longer waste 75% of the storage allocated.
There are also a number of ways this can be improved, so let's keep
working on it going forward.
Also added some assertions to DirectoryEntry in case someone tries to
instantiate them with names that would overflow the name buffer.
DirectoryEntry is a crappy data structure, and the name buffer is also
crappy. Added a FIXME about replacing it with something nicer.
Before this patch, the DirectoryEntry::name buffer would overflow if
you did "touch extremely-long-file-name". Duh.
Fixes#538.
This was a workaround to be able to build on case-insensitive file
systems where it might get confused about <string.h> vs <String.h>.
Let's just not support building that way, so String.h can have an
objectively nicer name. :^)
- TmpFSInode::write_bytes() needs to allow non-zero offsets
- TmpFSInode::read_bytes() wasn't respecting the offset
GCC puts the temporary files generated during compilation in /tmp,
so this exposed some bugs in TmpFS.
The complication is around /proc/sys/ variables, which were attached
to inodes. Now they're their own thing, and the corresponding inodes
are lazily created (as all other ProcFS inodes are) and simply refer
to them by index.
It is now possible to unmount file systems from the VFS via `umount`.
It works via looking up the `fsid` of the filesystem from the `Inode`'s
metatdata so I'm not sure how fragile it is. It seems to work for now
though as something to get us going.
This is more logical and allows us to solve the problem of
non-blocking TCP sockets getting stuck in SocketRole::None.
The only complication is that a single LocalSocket may be shared
between two file descriptions (on the connect and accept sides),
and should have two different roles depending from which side
you look at it. To deal with it, Socket::role() is made a
virtual method that accepts a file description, and LocalSocket
internally tracks which FileDescription is the which one and
returns a correct role.
Now that there can't be multiple clones of the same fd,
we only need to track whether or not an fd exists on each
side. Also there's no point in tracking connecting fds.
After a fork, the parent and the child are supposed to share
the same file description. For example, modifying the current
offset of a file description is visible in both of them.