In an early version of the huffman writing code, we always used 8 bits
here, and the comments still reflected that. Since we're now always
writing only as many bits as we need (in practice, still almost always
8), the comments are misleading.
Deflate and WebP can store at most 15 bits per symbol, meaning their
huffman trees can be at most 15 levels deep.
During construction, when we hit this level, we used to try again
with an ever lower frequency cap per symbol. This had the effect
of giving the symbols with the highest frequency lower frequencies
first, causing the most-frequent symbols to be merged. For example,
maybe the most-frequent symbol had 1 bit, and the 2nd-most-frequent
two bits (and everything else at least 3). With the cap, the two
most-frequent symbols might both end up with 2 bits, freeing up bits
for the lower levels of the tree.
This has the effect of making the most-frequent symbols longer at
first, which isn't great for file size.
Instead of using a frequency cap, ignore ever more of the low
bits of each frequency. This sacrifices resolution for the
least-frequent symbols first, i.e. at the lower levels of the
tree, and those symbols are the ones written least often.
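Roughly, the replacement strategy looks like this (a minimal sketch
with made-up names, not the actual code; compute_code_lengths() here
just stands in for whatever builds a Huffman tree from the given
frequencies):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Naive Huffman construction: repeatedly merge the two lowest-frequency
// subtrees and bump the code length of every symbol inside them.
// Quadratic, but short enough for a sketch.
std::vector<uint8_t> compute_code_lengths(std::vector<uint64_t> const& frequencies)
{
    struct Subtree { uint64_t frequency; std::vector<size_t> symbols; };
    std::vector<Subtree> subtrees;
    for (size_t i = 0; i < frequencies.size(); ++i)
        if (frequencies[i] > 0)
            subtrees.push_back({ frequencies[i], { i } });

    std::vector<uint8_t> lengths(frequencies.size(), 0);
    if (subtrees.size() == 1)
        lengths[subtrees[0].symbols[0]] = 1; // a lone symbol still takes one bit

    while (subtrees.size() > 1) {
        std::sort(subtrees.begin(), subtrees.end(),
            [](Subtree const& a, Subtree const& b) { return a.frequency < b.frequency; });
        for (size_t symbol : subtrees[0].symbols)
            ++lengths[symbol];
        for (size_t symbol : subtrees[1].symbols)
            ++lengths[symbol];
        subtrees[0].frequency += subtrees[1].frequency;
        subtrees[0].symbols.insert(subtrees[0].symbols.end(),
            subtrees[1].symbols.begin(), subtrees[1].symbols.end());
        subtrees.erase(subtrees.begin() + 1);
    }
    return lengths;
}

// Retry with ever more low bits of the frequencies dropped, until no
// code is longer than max_bits (15 for deflate and WebP).
std::vector<uint8_t> limited_code_lengths(std::vector<uint64_t> const& frequencies, uint8_t max_bits = 15)
{
    for (unsigned shift = 0;; ++shift) {
        std::vector<uint64_t> reduced = frequencies;
        for (auto& frequency : reduced)
            if (frequency > 0)
                frequency = std::max<uint64_t>(1, frequency >> shift); // keep used symbols present

        auto lengths = compute_code_lengths(reduced);
        uint8_t longest = 0;
        for (uint8_t length : lengths)
            longest = std::max(longest, length);
        if (longest <= max_bits)
            return lengths;
        // Tree too deep: try again with less resolution in the low bits.
    }
}
```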
For deflate, the 64 kiB block size means this matters little, but
for WebP it can have a big effect:
sunset-retro.png (876K): 2.02M -> 1.73M -- now (very slightly) smaller
than twice the input size! Maybe we'll be competitive one day.
(For wow.webp and 7z7c.webp, this has no effect: those have
relatively few colors, so we never hit the "tree too deep" case
there.)
No behavior change other than smaller file size. (No performance
cost either, and it's less code too.)
This implements one of the basics of webp compression: Huffman coding.
(The other parts of the basics are backreferences and color cache
entries; and after that there are the four transforms -- predictor,
subtract green, color indexing, color.)
How much huffman coding helps depends on the input's entropy.
Constant-color channels are now encoded in constant space, but
otherwise a huffman code always needs at least one bit per symbol.
Since each symbol is an 8-bit channel value, huffman coding alone
can at the very best reduce output size to 1/8th of the input size.
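For intuition, here's a hypothetical helper (not part of the encoder)
that estimates the achievable bits per symbol from a byte histogram,
treating each input byte as one symbol; the real encoder works on
per-channel symbols, and backreferences and the color cache come
later:

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Shannon entropy is a lower bound on the average code length of any
// prefix code; a Huffman code also needs at least 1 bit per symbol,
// hence the clamp below. That clamp is where the "at best 1/8th of the
// input size" limit comes from, since each symbol was 8 bits before.
double estimated_bits_per_symbol(uint8_t const* data, size_t size)
{
    std::array<uint64_t, 256> histogram {};
    for (size_t i = 0; i < size; ++i)
        ++histogram[data[i]];

    double entropy = 0;
    for (uint64_t count : histogram) {
        if (count == 0)
            continue;
        double p = static_cast<double>(count) / static_cast<double>(size);
        entropy -= p * std::log2(p);
    }
    return std::max(entropy, 1.0);
}
```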
For three test input files:
sunset-retro.png (876K): 2.25M -> 2.02M
(helps fairly little; from 2.6x as big as the png input to 2.36x)
giphy.gif (184k): 11M -> 4.9M
(pretty decent, from 61x as big as the gif input to 27x as big)
7z7c.gif (11K): 775K -> 118K
(close to the best possible improvement for just huffman coding;
from 70x as big as the gif input to 10.7x as big)
No measurable perf impact for encoding.
The code is pretty similar to Deflate.cpp in LibCompress, with just
enough differences that sharing code doesn't look like it's worth
it to me. I left comments outlining similarities.
We still construct the code length codes manually, and now we also
construct a PrefixCodeGroup manually that assigns 8 bits to all
symbols (except for fully-opaque alpha channels, and for the
unused distance codes, like before). But now we use the CanonicalCodes
from that PrefixCodeGroup for writing.
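The step from code lengths to concrete bit patterns is the usual
canonical-code construction (as described in RFC 1951 section 3.2.2);
a minimal sketch of that idea, not the actual CanonicalCode
implementation in LibCompress, and ignoring the bit order used when
the codes are finally written to the stream:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assign bit patterns from a list of code lengths the canonical way:
// codes of the same length are consecutive integers, handed out in
// symbol order.
std::vector<uint16_t> codes_from_lengths(std::vector<uint8_t> const& lengths)
{
    size_t max_length = 0;
    for (uint8_t length : lengths)
        max_length = std::max<size_t>(max_length, length);

    // How many codes there are of each length.
    std::vector<uint16_t> length_counts(max_length + 1, 0);
    for (uint8_t length : lengths)
        if (length > 0)
            ++length_counts[length];

    // Smallest code value for each length.
    std::vector<uint16_t> next_code(max_length + 1, 0);
    uint16_t code = 0;
    for (size_t length = 1; length <= max_length; ++length) {
        code = (code + length_counts[length - 1]) << 1;
        next_code[length] = code;
    }

    // Hand out consecutive codes to the symbols of each length.
    std::vector<uint16_t> codes(lengths.size(), 0);
    for (size_t symbol = 0; symbol < lengths.size(); ++symbol)
        if (lengths[symbol] > 0)
            codes[symbol] = next_code[lengths[symbol]]++;
    return codes;
}
```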
No behavior change at all, the output is bit-for-bit identical to
before. But this is a step towards actually huffman-coding symbols.
This is however a pretty big perf regression. For
`image -o test.webp test.bmp` (where test.bmp is sunset-retro.png
re-encoded as bmp), time goes from 23.7 ms to 33.2 ms.
`animation -o wow.webp giphy.gif` goes from 85.5 ms to 127.7 ms.
`animation -o 7z7c.webp 7z7c.gif` goes from 12.6 ms to 16.5 ms.
* Matches how the loader is organized
* `compress_VP8L_image_data()` will grow longer when we add actual
compression
* Maybe someone wants to write a lossy compressor one day
No behavior change.