Commit graph

23 commits

Author SHA1 Message Date
Nico Weber
af5a7b9a51 LibPDF: Don't crash on encrypted files with streams with filter arrays
Makes it possible to render more than 0 pages of CIPA_DC-003-2020_E.pdf
2023-07-24 09:50:45 -04:00
Nico Weber
f956cd6e6a LibPDF: Fix an off-by-one in computing_a_hash_r6_and_later()
With this, `pdf` can print info for CIPA_DC-003-2020_E.pdf
(from https://cipa.jp/e/std/std-sec.html), as well as all other
files I've tried.

CIPA_DC-003-2020_E.pdf is special because it quits this loop after
exactly 64 interations, at round_number 63.

While here, also update a comment to use the non-spec-comment style
I'm now using elsewhere in the file.
2023-07-21 11:55:20 +02:00
Nico Weber
f26783596d LibPDF: Implement StandardSecurityHandler::crypt for AESV3
With this, AESV3 support is complete and CIPA_DC-007-2021_E.pdf
can be opened :^)

(CIPA_DC-003-2020_E.pdf incorrectly cannot be opened yet. This
is due to a minor bug in computing_a_hash_r6_and_later() that
I'll fix a bit later. But except for this minor bug, all AESV3
files I've found so far seem to work.)
2023-07-21 11:55:20 +02:00
Nico Weber
12e77cba0a LibPDF: Move "7.6.2 General Encryption Algorithm" comment down a bit
The algorithm really only starts a bit later in the function,
so move the comment to there.
2023-07-21 11:55:20 +02:00
Nico Weber
6d0dbaf9d7 LibPDF: Extract aes helper in StandardSecurityHandler::crypt()
No behavior change, pure code move.

We'll use this for AESV3.
2023-07-21 11:55:20 +02:00
Nico Weber
9cbdb334ab LibPDF: Make try_provide_user_password() work for R6+ files
try_provide_user_password() calls compute_encryption_key_r6_and_later()
now. This checks both owner and user passwords. (For pre-R6 files,
owner password checking isn't yet implemented, as far as I can tell.)

With this, CIPA_DC-007-2021_E.pdf (or other AESV3-encrypted files)
successfully compute a file encryption key (...and then hit the
TODO() in StandardSecurityHandler::crypt() for AESV3, but it's
still good progress.)
2023-07-21 11:55:20 +02:00
Nico Weber
0428308420 LibPDF: Implement 7.6.4.3.3 Algorithm 2.A: Retrieve file encryption key
...for handlers of revision 6.

The spec for this algorithm has several quirks:

1. It describes how to authenticate a password as an owner password,
   but it redundantly inlines the description of algorithm 12 instead
   of referring to it. We just call that algorithm here.

2. It does _not_ describe how to authenticate a password as a user
   password before using the password to compute the file encryption
   key using an intermediate user key, despite the latter step that
   computes the file encryption key refers to the password as
   "user password". I added a call to algorithm 11 to check if the
   password is the user password that isn't in the spec. Maybe I'm
   misunderstanding the spec, but this looks like a spec bug to me.

3. It says "using AES-256 in ECB mode with an initialization vector
   of zero". ECB mode has no initialization vector. CBC mode with
   initialization vector of zero for message length 16 is the same
   as ECB mode though, so maybe that's meant? (In addition to the
   spec being a bit wobbly, using EBC in new software isn't
   recommended, but too late for that.)

SASLprep / stringprep still aren't implemented. For ASCII passwords
(including the important empty password), this is good enough.
2023-07-21 11:55:20 +02:00
Nico Weber
f8a3022ca2 LibPDF: Plumb OE, UE, Perms values to StandardSecurityHandler 2023-07-21 11:55:20 +02:00
Nico Weber
57768325cc LibPDF: Implement 7.6.4.4.11 Algorithm 12: Authenticating owner password
...for handlers of revision 6.

Since this adds U to the hash input, also trim the size of U and O to
48 bytes. The spec requires them to be 48 bytes, but all the newer PDFs
on https://cipa.jp/e/std/std-sec.html have 127 bytes -- 48 real bytes
and 79 nul padding bytes. These files were created by:

    Creator: Word 用 Acrobat PDFMaker 17
    Producer: Adobe PDF Library 15.0

and

    Creator: Word 用 Acrobat PDFMaker 17
    Producer: Adobe PDF Library 17.11.238
2023-07-21 11:55:20 +02:00
Nico Weber
8f6c67a71c LibPDF: Implement 7.6.4.4.10 Algorithm 11: Authenticating user password
...for handlers of revision 6.
2023-07-21 11:55:20 +02:00
Nico Weber
f23a394aac LibPDF: Stop using MUST in Encryption.cpp
...and use `release_value_but_fixme_should_propagate_errors()` instead,
as requested by mattco98.
2023-07-21 11:55:20 +02:00
Nico Weber
c4bad2186f LibPDF: Implement 7.6.4.3.4 Algorithm 2.B: Computing a hash
This is a step towards AESV3 support for PDF files.

The straight-forward way of writing this with our APIs is pretty
allocation-heavy, but this code won't run all that often for the
regular "open PDF, check password" flow.
2023-07-19 21:26:55 +01:00
Nico Weber
7a48d59727 LibPDF: Simplify AESV2 code a bit
- `encrypt()` will always fill a multiple of block size,
  `decrypt()` might produce less data. But other than that,
  the middle span isn't modified even though it's a reference.
  So pass the ByteBuffer to assign() (kind of like before 5998072f15,
  but pass-by-move())

- In the encryption code path, assign a single buffer for IV and data
  instead of awkwardly copying the data around later.

Thanks to CxByte for suggesting most of this!

No intentional behavior change.
2023-07-18 18:48:57 +02:00
Nico Weber
281e3158c0 LibPDF: Some preparatory work for AESV3
This detects AESV3, and copies over the spec comments explaining what
needs to be done, but doesn't actually do it yet.

AESV3 is technically PDF 2.0-only, but
https://cipa.jp/std/documents/download_e.html?CIPA_DC-007-2021_E has a
1.7 PDF that uses it.

Previously we'd claim that we need a password to decrypt it.
Now, we cleanly crash with a TODO() \o/
2023-07-14 06:34:03 +02:00
Nico Weber
5998072f15 LibPDF: Add support for AESV2 encryption 2023-07-12 06:28:15 +02:00
Nico Weber
92d2895057 LibPDF: Remove a pointless template specialization
We can just have two functions with actual names instead of specializing
on a bool template parameter.

No behavior change.
2023-07-12 06:28:15 +02:00
Nico Weber
826c0426f3 LibPDF: Fix two use-after-frees
Two lambdas were capturing locals that were out of scope by the
time the lambdas ran.

With this, `pdf` can successfully load and print the page count of
pdf_reference_1.7.pdf.
2023-07-10 17:48:15 +01:00
Simon Danner
5fa8068580 LibPDF: Fix calculation of encryption key
Before this patch, the generation of the encryption key was not working
correctly since the lifetime of the underlying data was too short,
same inputs would give random encryption keys.

Fixes #16668
2023-01-04 11:10:37 -05:00
Rodrigo Tobar
bb48a67f84 LibPDF: Reset encryption key on failed user password attempt
When an attempt is made to provide the user password to a
SecurityHandler a user gets back a boolean result indicating success or
failure on the attempt. However, the SecurityHandler is left in a state
where it thinks it has a user password, regardless of the outcome of the
attempt. This confuses the rest of the system, which continues as if the
provided password is correct, resulting in garbled content.

This commit fixes the situation by resetting the internal fields holding
the encryption key (which is used to determine whether a user password
has been successfully provided) in case of a failed attempt.
2022-12-20 10:28:58 +01:00
Rodrigo Tobar
dc6a11cf6b LibPDF: Treat Encyption's Length item as optional
With the StandardSecurityHandler the Length item in the Encryption
dictionary is optional, and needs to be given only if the encryption
algorithm (V) is other than 1; otherwise we can assume a length of 40
bits for the encryption key.
2022-12-20 10:28:58 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Linus Groh
173dcfb7cb Everywhere: Fix a bunch of typos 2022-05-29 15:22:00 +02:00
Matthew Olsson
5b316462b2 LibPDF: Add implementation of the Standard security handler
Security handlers manage encryption and decription of PDF files. The
standard security handler uses RC4/MD5 to perform its crypto (AES as
well, but that is not yet implemented).
2022-03-29 02:52:57 +02:00