This needed the same `jbig2` changes as for the non-transposed ones,
and the changes to it mentioned on #23780.
I used the same .ini files as for the non-transposed ones, except
that I added `-txt -Param -Transposed 1` as last line to each of them.
All three new files display fine in Chrome.
They all look busted in Firefox.
I think this is likey a bug in pdf.js that I'll report upstream.
(Reportedly they look fine in Acrobat on Android.)
This already worked fine. Now it's tested.
I did have to teach `jbig2` to correctly generate test files for this.
See the PR adding these tests for local changes.
I used the script from #23659 to create these images, but I replaced
these lines:
```
-txt -Param -numInst 4
-ID 2 108 50 -ID 3 265 60 -ID 1 100 135 -ID 0 70 232
-txt -Param -RefCorner 2
```
For `bottomleft`, I replaced them with:
```
-txt -Param -numInst 4
-ID 2 137 50 -ID 3 294 60 -ID 1 199 135 -ID 0 319 232
-txt -Param -RefCorner 0
```
For `bottomright`, I replaced them with:
```
-txt -Param -numInst 4
-ID 2 108 50 -ID 3 265 60 -ID 1 100 135 -ID 0 70 232
-txt -Param -RefCorner 2
```
For `topright`, I replaced them with:
```
-txt -Param -numInst 4
-ID 2 108 79 -ID 3 265 89 -ID 1 100 234 -ID 0 70 351
-txt -Param -RefCorner 3
```
All three new files display fine in Chrome.
The bottomleft one displays fine in Firefox, while the other two
look compressed in X. I think this is a bug in pdf.js that I'll
report upstream.
(Reportedly they look fine in Acrobat on Android.)
See the PR adding this test for local changes to `jbig2`.
I used the shell script mentioned in #23659, except I added the line
`-txt -Param -Transposed 1` at the very end of the .ini file.
As with all the symbol test cases, after running
Meta/jbig2_to_pdf.py -o foo.pdf foo.jb2 399 400
the file opens up ok in Chrome and Firefox (but not Safari), so
maybe it's not completely broken.
The T.800 spec says there should only be one 'colr' box, but the
extended jpx file format spec in T.801 annex M allows having multiple.
Method 2 is a basic ICC profile, while method 3 (jpx-only) allows full
ICC profiles. Support that.
For the test, I opened buggie.png in Photoshop, converted it to
grayscale, and saved it as a JPEG2000, with "JP2 Compatible" checked
and "Include Transparency" unchecked. I also unchecked "Include
Metadata", and "Lossless". I left "Fast Mode" checked and the quality
at the default 50.
This adds a test for the code added in #23710.
I created this file using `jbig2` (see below for details), but as
usual it required a bunch of changes to it to make it actually produce
spec-compliant output. See the PR adding this image for my local diff.
I created the test image file by running this shell script with
`jbig2` tweaked as described above:
#!/bin/bash
set -eu
S=Tests/LibGfx/test-inputs/bmp/bitmap.bmp
# See make-symbol-jbig.sh (the script in #23659) for the general
# setup and some comments. See also make-symbol-textrefine.sh (in
# #23713).
#
# `-Ref` takes 5 arguments:
# 1. The symbol ID of this symbol (like after a `-Simple`)
# 2. A bmp file that the base symbol gets refined to
# 3. The ID of the base symbol
# 4. dx, dy
cat << EOF > jbig2-symbol-symbolrefine.ini
-sym -Seg 1
-sym -file -numClass -HeightClass 3 -WidthClass 1
-sym -file -numSymbol 3
-sym -file -Height 250
-sym -file -Width 120 -Simple 0 mouth-1bpp.bmp
-sym -file -EndOfHeightClass
-sym -file -Height 100
-sym -file -Width 100 -Simple 1 nose-1bpp.bmp
-sym -file -EndOfHeightClass
-sym -file -Height 30
-sym -file -Width 30 -Simple 2 top_eye-1bpp.bmp
-sym -file -EndOfHeightClass
-sym -Param -Huff_DH 0
-sym -Param -Huff_DW 0
-sym -Seg 2
-sym -file -numClass -HeightClass 1 -WidthClass 1
-sym -file -numSymbol 1
-sym -file -Height 30
-sym -file -Width 30 -Ref 3 bottom_eye-1bpp.bmp 2 0 0
-sym -file -EndOfHeightClass
-sym -Param -Huff_DH 0
-sym -Param -Huff_DW 0
-sym -Param -RefTemplate 1
-txt -Seg 3
-txt -Param -numInst 4
-ID 2 108 50 -ID 3 265 60 -ID 1 100 135 -ID 0 70 232
-txt -Param -RefCorner 1
-txt -Param -Xlocation 0
-txt -Param -Ylocation 0
-txt -Param -W 399
-txt -Param -H 400
EOF
J=$HOME/Downloads/T-REC-T.88-201808-I\!\!SOFT-ZST-E/Software
J=$J/JBIG2_SampleSoftware-A20180829/source/jbig2
$J -i "${S%.bmp}" -f bmp -o symbol-symbolrefine -F jb2 \
-ini jbig2-symbol-symbolrefine.ini
We can't decode any actual image data yet, but it shows that we can
read the basics of the container format. (...as long as there's an
Annex I container around the data, not just an Annex A codestream.
All files I've found so far have the container.)
I drew the thes input in Acorn.app and used "Save as..." to save it as
JPEG2000. It's an RGBA image.
This adds a test for the code added in #23696.
I created this file using `jbig2` (see below for details), but as
usual it required a bunch of changes to it to make it actually produce
spec-compliant output. See the PR adding this image for my local diff.
I created the test image file by running this shell script with
`jbig2` tweaked as described above:
#!/bin/bash
set -eu
S=Tests/LibGfx/test-inputs/bmp/bitmap.bmp
# See make-symbol-jbig.sh (the script in #23659) for the general
# setup and some comments. Note that the symbol section here only
# has 3 symbols, instead of 4 over there.
#
# `-RefID` takes 6 arguments:
# 1. The symbol ID of the base symbol (like after an `-ID`)
# 2. A bmp file that the base symbol gets refined to
# 3. y, x (like after an `-ID`)
# 4. dx, dy (note swapped order to previous item)
#
# We also explicitly set refinement adaptive pixels, because the
# default adaptive refinement pixels aren't the nominal pixels from
# the spec.
cat << EOF > jbig2-symbol-textrefine.ini
-sym -Seg 1
-sym -file -numClass -HeightClass 3 -WidthClass 1
-sym -file -numSymbol 3
-sym -file -Height 250
-sym -file -Width 120 -Simple 0 mouth-1bpp.bmp
-sym -file -EndOfHeightClass
-sym -file -Height 100
-sym -file -Width 100 -Simple 1 nose-1bpp.bmp
-sym -file -EndOfHeightClass
-sym -file -Height 30
-sym -file -Width 30 -Simple 2 top_eye-1bpp.bmp
-sym -file -EndOfHeightClass
-sym -Param -Huff_DH 0
-sym -Param -Huff_DW 0
-txt -Seg 2
-txt -Param -numInst 4
-ID 2 108 50 -RefID 2 bottom_eye-1bpp.bmp 265 60 0 0
-ID 1 100 135 -ID 0 70 232
-txt -Param -RefCorner 1
-txt -Param -Xlocation 0
-txt -Param -Ylocation 0
-txt -Param -W 399
-txt -Param -H 400
-txt -Param -rATX1 -1
-txt -Param -rATY1 -1
-txt -Param -rATX2 -1
-txt -Param -rATY2 -1
EOF
J=$HOME/Downloads/T-REC-T.88-201808-I\!\!SOFT-ZST-E/Software
J=$J/JBIG2_SampleSoftware-A20180829/source/jbig2
$J -i "${S%.bmp}" -f bmp -o symbol-textrefine -F jb2 -ini \
jbig2-symbol-textrefine.ini
Template 2 is needed by some symbols in 0000372.pdf page 11 and
0000857.pdf pages 1-4. Implement the others too while here. (The
mentioned pages in those two PDFs also use the "end of stripe" segment,
so they still don't render yet.
We still don't support EXTTEMPLATE.
This extracts the bitbuffer combining code we had into a new function
composite_bitbuffer() and adds the following features:
* Real support for combination operators (which also lets us allow black
as background color again, even if that's never used in practice)
* Clipping support (not used here yet, but will be needed elsewhere
soon)
We're going to need this for text segment handling.
No behavior change.
It seems to do the right thing already, and nothing in the spec says
not to do this as far as I can tell.
With this, we can finally decode
Tests/LibGfx/test-inputs/jbig2/bitmap.jbig2 and add a test for
decoding simple arithmetic-coded images.
In practice, everything uses white backgrounds and operators `or`
or `xor` to turn them black, at least for the simple images we're
about to be able to decode.
To make sure we don't forget implementing this for real once needed,
reject other ops, and also reject black backgrounds (because 1 | 0
is 1, not 0 like our overwrite implementation will produce).
This means we have to remove a test, but since this scenario doesn't
seem to happen in practice, that seems ok.
The context can vary for every bit we read.
This does not affect the one use in the test which reuses the same
context for all bits, but it is necessary for future changes.
I think the context normally changes for every bit. But this here
is enough to correctly decode the test bitstream in Annex H.2 in
the spec, which seems like a good checkpoint.
The internals of the decoder use spec naming, to make the code
look virtually identical to what's in the spec. (Even so, I managed
to put in several typos that took a while to track down.)
With this, `image` can convert any jbig2 file, as long as it's
black (or white), and LibPDF can draw jbig2 files (again, as long
as they only contain a single color stored in just a
PageInformation segment).
This allows `file` to correctly print the dimensions of a .jbig2 file,
and it allows us to write a test that covers much of all the code
written so far.
This is for validating that a decoder with a weak or nonexistent
sniff() method thinks it can decode an image. This should not be
treated as an error.
No behavior change.
The semantics of BGRx8888 aren't super clear and it means different
things for different parts of the codebase. In particular, the PNG
writer still writes the x channel to the alpha channel of its output.
In BMPs, the 4th palette byte is usually 0, which means after #21412 we
started writing all .bmp files with <= 8bpp as completely transparent
to PNGs.
This works around that.
(See also #19464 for previous similar workarounds.)
The added `bitmap.bmp` is a 1bpp file I drew in Photoshop and saved
using its "Save as..." saving path.
A tile is basically a strip with a user-defined width. With that in
mind, adding support for them is quite straightforward. As a lot the
common code was named after 'strips', to avoid future confusion I
renamed everything that interact with either strips or tiles to a
global term: 'segment'.
Note that tiled images are supposed to always have a 'TileOffsets' tag
instead of 'StripOffset'. However, this doesn't seem to be enforced by
encoders, so we support having either of them indifferently.
The test case was generated with the following Python script:
import pyvips
img = pyvips.Image.new_from_file('deflate.tiff')
img.write_to_file('tiled.tiff',
compression=pyvips.ForeignTiffCompression.DEFLATE,
tile=True, tile_width=64, tile_height=64)
JPEGs can store a `restart_interval`, which controls how many
minimum coded units (MCUs) apart the stream state resets.
This can be used for error correction, decoding parts of a jpeg
in parallel, etc.
We tried to use
u32 i = vcursor * context.mblock_meta.hpadded_count + hcursor;
i % (context.dc_restart_interval *
context.sampling_factors.vertical *
context.sampling_factors.horizontal) == 0
to check if we hit a multiple of an MCU.
`hcursor` is the horizontal offset into 8x8 blocks, vcursor the
vertical offset, and hpadded_count stores how many 8x8 blocks
we have per row, padded to a multiple of the sampling factor.
This isn't quite right if hcursor isn't divisible by both
the vertical and horizontal sampling factor. Tweak things so
that they work.
Also rename `i` to `number_of_mcus_decoded_so_far` since that
what it is, at least now.
For the test case, I converted an existing image to a ppm:
Build/lagom/bin/image -o out.ppm \
Tests/LibGfx/test-inputs/jpg/12-bit.jpg
Then I resized it to 102x77px in Photoshop and saved it again.
Then I turned it into a jpeg like so:
path/to/cjpeg \
-outfile Tests/LibGfx/test-inputs/jpg/odd-restart.jpg \
-sample 2x2,1x1,1x1 -quality 5 -restart 3B out.ppm
The trick here is to:
a) Pick a size that's not divisible by the data size width (8),
and that when rounded to a block size (13) still isn't divisible
by the subsample factor -- done by picking a width of 102.
b) Pick a huffman table that doesn't happen to contain the bit
pattern for a restart marker, so that reading a restart marker
from the bitstream as data causes a failure (-quality 5 happens
to do this)
c) Pick a restart interval where we fail to skip it if our calculation
is off (-restart 3B)
Together with #22987, fixes#22780.
Non-interleaved files always have an MCU of one data unit.
(A "data unit" is an 8x8 tile of pixels, and an "MCU" is a
"minium coded unit", e.g. 2x2 data units for luminance and
1 data unit each for Cr and Cb for a YCrCb image with
4:2:0 subsampling.)
For the test case, I converted an existing image to a ppm:
Build/lagom/bin/image -o out.ppm \
Tests/LibGfx/test-inputs/jpg/12-bit.jpg
Then I converted it to grayscale and saved it as a pgm in Photoshop.
Then I turned it into a weird jpeg like so:
path/to/cjpeg \
-outfile Tests/LibGfx/test-inputs/jpg/grayscale_mcu.jpg \
-sample 2x2 -restart 3 out.pgm
Makes 3 of the 5 jpegs failing to decode at #22780 go.
When present, the alpha channel is also affected by the horizontal
differencing predictor.
The test case was generated with GIMP with the following steps:
- Open an RGB image
- Add a transparency layer
- Export as TIFF with the LZW compression scheme
This tag is required by the specification, but some encoders (at least
Krita) don't write it for images with a single strip.
The test file was generated by opening deflate.tiff in Krita and saving
it with the DEFLATE compression.
Type 2 <=> One-dimensional Group3, customized for TIFF
Type 3 <=> Two-dimensional Group3, uses the original 1D internally
Type 4 <=> Two-dimensional Group4
So let's clarify that this is not Group3 1D but the TIFF variant, which
is called `CCITTRLE` in libtiff. So let's stick with this name to avoid
confusion.
Images with a display mask ("stencil" as it's called in DPaint) add
an extra bitplane which acts as a mask. For now, at least skip it
properly. Later we should render masked pixels as transparent, but
this requires some refactoring.
We now allow all subsampling factors where the subsampling factors
of follow-on components evenly decode the ones of the first component.
In practice, this allows YCCK 2111, CMYK 2112, and CMYK 2111.
Some apps seem to generate malformed images that are accepted
by most readers. We now only throw if malformed data would lead to
a write outside the chunky buffer.
When using the BMP encoding, ICO images are expected to contain a 1-bit
mask for transparency. Regardless an alpha channel is already included
in the image, the mask is always required. As stated here[1], the
mask is used to provide shadow around the image.
Unfortunately, it seems that some encoder do not include that second
transparency mask. So let's read that mask only if some data is still
remaining after decoding the image.
The test case has been generated by truncating the 64 last bytes
(originally dedicated to the mask) from the `serenity.ico` file and
changing the declared size of the image in the ICO header. The size
value is stored at the offset 0x0E in the file and I changed the value
from 0x0468 to 0x0428.
[1]: https://devblogs.microsoft.com/oldnewthing/20101021-00/?p=12483
This fixes an issue where GIF images without a global color table would
have the first segment incorrectly interpreted as color table data.
Makes many more screenshots appear on https://virtuallyfun.com/ :^)
TIFF files are made in a way that make them easily extendable and over
the years people have made sure to exploit that. In other words, it's
easy to find images with non-standard tags. Instead of returning an
error for that, let's skip them.
Note that we need to make sure to realign the reading head in the file.
The test case was originally a 10x10 checkerboard image with required
tags, and also the `DocumentName` tag. Then, I modified this tag in a
hexadecimal editor and replaced its id with 30 000 (0x3075 as a LE u16)
and the type with the same value as well. This is AFAIK, never used as
a custom TIFF tag, so this should remain an invalid tag id and type.
We currently assume that the K (black) channel uses the same sampling
as the Y channel already, so this already works as long as we don't
error out on it.