Previously we'd just dump those packets into the network adapter's
send queue and hope for the best. Instead we should wait until the peer
has sent TCP ACK packets.
Ideally this would parse the TCP window size option from the SYN or
SYN|ACK packet, but for now we just assume the window size is 64 kB.
Previously we'd allocate buffers when sending packets. This patch
avoids these allocations by using the NetworkAdapter's packet queue.
At the same time this also avoids copying partially constructed
packets in order to prepend Ethernet and/or IPv4 headers. It also
properly truncates UDP and raw IP packets.
Previously TCPSocket::send_tcp_packet() would try to send TCP packets
which matched whatever size the userspace program specified. We'd try to
break those packets up into smaller fragments, however a much better
approach is to limit TCP packets to the maximum segment size and
avoid fragmentation altogether.
Previously we didn't retransmit lost TCP packets which would cause
connections to hang if packets were lost. Also we now time out
TCP connections after a number of retransmission attempts.
Note that the changes to IPv4Socket::create are unfortunately needed as
the return type of TCPSocket::create and IPv4Socket::create don't match.
- KResultOr<NonnullRefPtr<TcpSocket>>>
vs
- KResultOr<NonnullRefPtr<Socket>>>
To handle this we are forced to manually decompose the KResultOr<T> and
return the value() and error() separately.
When we receive a TCP packet with a sequence number that is not what
we expected we have lost one or more packets. We can signal this to
the sender by sending a TCP ACK with the previous ack number so that
they can resend the missing TCP fragments.
When the MSS option header is missing the default maximum segment
size is 536 which results in lots of very small TCP packets that
NetworkTask has to handle.
This adds the MSS option header to outbound TCP SYN packets and
sets it to an appropriate value depending on the interface's MTU.
Note that we do not currently do path MTU discovery so this could
cause problems when hops don't fragment packets properly.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
The overrides of this function don't need to know how the original
packet was stored, so let's just give them a ReadonlyBytes view of
the raw packet data.
This makes most operations thread safe, especially so that they
can safely be used in the Kernel. This includes obtaining a strong
reference from a weak reference, which now requires an explicit
call to WeakPtr::strong_ref(). Another major change is that
Weakable::make_weak_ref() may require the explicit target type.
Previously we used reinterpret_cast in WeakPtr, assuming that it
can be properly converted. But WeakPtr does not necessarily have
the knowledge to be able to do this. Instead, we now ask the class
itself to deliver a WeakPtr to the type that we want.
Also, WeakLink is no longer specific to a target type. The reason
for this is that we want to be able to safely convert e.g. WeakPtr<T>
to WeakPtr<U>, and before this we just reinterpret_cast the internal
WeakLink<T> to WeakLink<U>, which is a bold assumption that it would
actually produce the correct code. Instead, WeakLink now operates
on just a raw pointer and we only make those constructors/operators
available if we can verify that it can be safely cast.
In order to guarantee thread safety, we now use the least significant
bit in the pointer for locking purposes. This also means that only
properly aligned pointers can be used.
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.
So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.
To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.
Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
We can now participate in the TCP connection closing handshake. :^)
This implementation is definitely not complete and needs to handle a
bunch of other cases. But it's a huge improvement over not being able
to close connections at all.
Note that we hold on to pending-close sockets indefinitely, until they
are moved into the Closed state. This should also have a timeout but
that's still a FIXME. :^)
Fixes#428.
The idea behind WeakPtr<NetworkAdapter> was to support hot-pluggable
network adapters, but on closer thought, that's super impractical so
let's not go down that road.
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.
For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.
Going forward, all new source files should include a license header.
This approach is a bit naiive - whenever we send a packet out, we
check to see if there are any other packets we should try to send.
This works well enough for a busy connection but not very well for a
quiet one. Ideally we would check for not-acked packets on some kind
of timer, and use the length of this not-acked list as feedback to
throttle the writes coming from userspace.
This is comprised of five small changes:
* Keep a counter for tx/rx packets/bytes per TCP socket
* Keep a counter for tx/rx packets/bytes per network adapter
* Expose that data in /proc/net_tcp and /proc/netadapters
* Convert /proc/netadapters to JSON
* Fix up ifconfig to read the JSON from netadapters
This has several significant changes to the networking stack.
* Significant refactoring of the TCP state machine. Right now it's
probably more fragile than it used to be, but handles quite a lot
more of the handshake process.
* `TCPSocket` holds a `NetworkAdapter*`, assigned during `connect()` or
`bind()`, whichever comes first.
* `listen()` is now virtual in `Socket` and intended to be implemented
in its child classes
* `listen()` no longer works without `bind()` - this is a bit of a
regression, but listening sockets didn't work at all before, so it's
not possible to observe the regression.
* A file is exposed at `/proc/net_tcp`, which is a JSON document listing
the current TCP sockets with a bit of metadata.
* There's an `ETHERNET_VERY_DEBUG` flag for dumping packet's content out
to `kprintf`. It is, indeed, _very debug_.
After reading a bunch of POSIX specs, I've learned that a file descriptor
is the number that refers to a file description, not the description itself.
So this patch renames FileDescriptor to FileDescription, and Process now has
FileDescription* file_description(int fd).
Also run it across the whole tree to get everything using the One True Style.
We don't yet run this in an automated fashion as it's a little slow, but
there is a snippet to do so in makeall.sh.