With the newly supported fuzzy matching in our test-web runner, we can
now define the expected maximum color channel and pixel count errors per
failing test and set a baseline they should not exceed.
The figures I added to these tests all come from my macOS M4 machine.
Most discrepancies seem to come from color calculations being slightly
off.