Bit reversals are used very often in intra-predicted frames. Turning
these into a constexpr lookup table reduces the branching needed for
block transforms significantly. This reduces the times spent decoding
an intra-heavy 1080p video by about 9% (~14.3s -> ~12.9s).
This moves all the frame size calculation to `FrameContext`, where the
subsampling is easily accessible to determine the size for each plane.
The internal framebuffer size has also been reduced to the exact frame
size that is output.
Like the non-zero tokens and segmentation IDs, these can be moved into
the tile decoding loop for above context and allocated by TileContext
for left context.
The array containing the vertical line of bools indicating whether non-
zero tokens were decoded in each sub-block is moved to TileContext, and
a span of the valid range for a block to read and write to is created
when we construct a BlockContext.
Previously, the variables were named similarly to the names in spec
which aren't very human-readable. This adds some utility functions for
dimensional unit conversions and names the variables in residual()
based on their units.
References to 4x4 blocks were also renamed to call them sub-blocks
instead, since unit conversion functions would not be able to begin
with "4x4_blocks".
The first keyframe of the test video can be decoded with these changes.
Raw memory allocations in the Parser have been replaced with Vector or
Array to avoid memory leaks and OOBs.
This was required for correctly parsing more than one frame's
height/width data properly. Additionally, start handling failure
a little more gracefully. Since we don't fully parse a tile before
starting to parse the next tile, we will now no longer make it past
the first tile mark, meaning we should not handle that scenario well.