LibWeb: Do not require multipart form data to end with CRLF
Some checks are pending
CI / macOS, arm64, Sanitizer, Clang (push) Waiting to run
CI / Linux, x86_64, Fuzzers, Clang (push) Waiting to run
CI / Linux, x86_64, Sanitizer, GNU (push) Waiting to run
CI / Linux, x86_64, Sanitizer, Clang (push) Waiting to run
Package the js repl as a binary artifact / Linux, arm64 (push) Waiting to run
Package the js repl as a binary artifact / macOS, arm64 (push) Waiting to run
Package the js repl as a binary artifact / Linux, x86_64 (push) Waiting to run
Run test262 and test-wasm / run_and_update_results (push) Waiting to run
Lint Code / lint (push) Waiting to run
Label PRs with merge conflicts / auto-labeler (push) Waiting to run
Push notes / build (push) Waiting to run

According to RFC 2046, the BNF of the form data body is:

    multipart-body := [preamble CRLF]
                      dash-boundary transport-padding CRLF
                      body-part *encapsulation
                      close-delimiter transport-padding
                      [CRLF epilogue]

Where "epilogue" is any text that "may be ignored or discarded". So we
should stop parsing the body once we encounter the terminating delimiter
("--").

Note that our parsing function is from an attempt to standardize the
grammar in the spec: https://andreubotella.github.io/multipart-form-data
This proposal hasn't been updated in ~4 years, and the fetch spec still
does not have a formal definition of the body string.
This commit is contained in:
Timothy Flynn 2025-09-15 09:23:21 -04:00 committed by Jelle Raaijmakers
commit 7b3465ab55
Notes: github-actions[bot] 2025-09-15 16:33:58 +00:00
3 changed files with 45 additions and 1 deletions

View file

@ -394,7 +394,9 @@ MultipartParsingErrorOr<Vector<XHR::FormDataEntry>> parse_multipart_form_data(JS
return MultipartParsingError { MUST(String::formatted("Expected `--` followed by boundary at position {}", lexer.tell())) };
// 2. If position points to the sequence of bytes 0x2D 0x2D 0x0D 0x0A (`--` followed by CR LF) followed by the end of input, return entry list.
if (lexer.next_is("--\r\n"sv))
// NOTE: We do not require the input to end with CRLF to match the behavior of other browsers. According to RFC 2046, we are to discard any
// text after the terminating `--`. See: https://datatracker.ietf.org/doc/html/rfc2046#page-22
if (lexer.next_is("--"sv))
return entry_list;
// 3. If position does not point to a sequence of bytes starting with 0x0D 0x0A (CR LF), return failure.