3218 commits

Author SHA1 Message Date
Andreas Kling
1de29e3f59 LibWeb: Implement the "self closing start tag" tokenizer state 2020-05-27 18:30:29 +02:00
Andreas Kling
a5ce09f8e3 LibWeb: Implement partial support for numeric character references 2020-05-27 18:30:27 +02:00
Matthew Olsson
dd08c992e8 LibJS: Simplify and normalize publicly-exposed Object functions
Previously, the Object class had many different types of functions for
each action. For example: get_by_index, get(PropertyName),
get(FlyString). This is a bit verbose, so these methods have been
shortened to simply use the PropertyName structure. The methods then
internally call the _by_index variants if necessary. Note that the
_by_index variants have been made private to enforce this change.

Secondly, a clear distinction has been made between "putting" and
"defining" an object property. "Putting" should mean modifying a
(potentially) already existing property. This is akin to doing "a.b =
'foo'".

This implies two things about put operations:
    - They will search the prototype chain for setters and call them, if
      necessary.
    - If no property exists with a particular key, the put operation
      should create a new property with the default attributes
      (configurable, writable, and enumerable).

In contrast, "defining" a property should completely overwrite any
existing value without calling setters (if that property is
configurable, of course).

Thus, all of the many JS objects have had any "put" calls changed to
"define_property" calls. Additionally, "put_native_function" and
"put_native_property" have had their "put" replaced with "define".

Finally, "put_own_property" has been made private, as all necessary
functionality should be exposed with the put and define_property
methods.
2020-05-27 13:17:35 +02:00
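
A minimal sketch of the put/define distinction described above, assuming a toy object model. `ToyObject`, its `Property` struct, and the string-only values are hypothetical and are not the LibJS `Object` API; writable and enumerable attributes are left out for brevity.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical toy object; not the LibJS Object class.
struct ToyObject {
    struct Property {
        std::string value;
        // Empty when the property is a plain data property.
        std::function<void(ToyObject&, const std::string&)> setter;
        bool configurable = true;
    };

    std::map<std::string, Property> properties;
    ToyObject* prototype = nullptr;

    // "Defining": overwrite the own property without calling any setter.
    // Refuses only if an existing own property is not configurable.
    bool define_property(const std::string& key, const std::string& value)
    {
        auto it = properties.find(key);
        if (it != properties.end() && !it->second.configurable)
            return false;
        Property property;
        property.value = value;
        properties[key] = property;
        return true;
    }

    // "Putting": akin to `a.b = 'foo'`. Search the prototype chain for a
    // setter and call it; otherwise create or update an own property.
    void put(const std::string& key, const std::string& value)
    {
        for (ToyObject* object = this; object; object = object->prototype) {
            auto it = object->properties.find(key);
            if (it == object->properties.end())
                continue;
            if (it->second.setter) {
                it->second.setter(*this, value);
                return;
            }
            break; // A data property shadows anything further up the chain.
        }
        properties[key].value = value; // New properties get default attributes.
    }
};
```

In this sketch, a `put` on a key whose setter lives on the prototype calls that setter and leaves the receiver without an own property, while `define_property` on the same key creates an own property and never touches the setter.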
Sergey Bugaev
fce49b3e32 LibGUI: Change GUI::KeyEvent::key() type to KeyCode
...instead of a plain int. Yay for some type safety.
2020-05-27 11:19:38 +02:00
AnotherTest
790915da54 LibWeb: Provide some properties to inspectors of ResourceLoader 2020-05-27 11:13:02 +02:00
TheDumpap
c700a30ce8 LibWeb: Handle additional parser inputs in "initial" and "before html". 2020-05-27 11:10:54 +02:00
Emanuele Torre
8d8c33833f LibWeb: s_initialized should be static in the AttributeNames initialiser 2020-05-27 09:57:38 +02:00
Andreas Kling
4ec8b9f6ee LibWeb: Use FlyString in FontCache keys 2020-05-26 23:45:48 +02:00
Andreas Kling
82444048de LibWeb: Add cached global attribute name FlyStrings
Instead of creating extremely common FlyStrings like "id" and "class"
on demand every time they are needed, we now have AttributeNames.h,
which provides Web::HTML::AttributeNames::{id,class_}

This avoids a bunch of string allocations during selector matching.
2020-05-26 23:45:43 +02:00
Andreas Kling
5069d380a8 LibWeb: Let Element cache its list of classes
Instead of string splitting every time you call Element::has_class(),
we now split the "class" attribute value when it changes, and cache
the individual classes as FlyStrings in Element::m_classes.

This makes has_class() significantly faster and moves the pain point
of selector matching somewhere else.
2020-05-26 23:07:19 +02:00
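
The caching idea can be sketched roughly as follows. `ToyElement` is a hypothetical stand-in, and `std::string` takes the place of FlyString, so this shows only the "split once on change, look up many times" shape and not the LibWeb code.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an element; std::string stands in for FlyString.
class ToyElement {
public:
    // Called whenever the "class" attribute changes: split once, cache the result.
    void set_class_attribute(const std::string& value)
    {
        m_classes.clear();
        std::istringstream stream(value);
        std::string name;
        while (stream >> name) // operator>> skips runs of whitespace
            m_classes.push_back(name);
    }

    // Called many times during selector matching: no splitting, just a lookup.
    bool has_class(const std::string& name) const
    {
        return std::find(m_classes.begin(), m_classes.end(), name) != m_classes.end();
    }

private:
    std::vector<std::string> m_classes;
};
```

With real FlyStrings the comparison inside the lookup would presumably be cheaper than a full string compare; that part is not modelled here.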
Andreas Kling
7ed80ae96c LibWeb: Make the CSS parser a little more tolerant to invalid CSS
Sometimes people put a '}' where it doesn't belong, or various other
things go wrong. 99% of the time, it's our fault, but either way,
this patch makes us not crash or infinite-loop in some common cases.

The real solution here is to write a proper CSS lexer-parser according
to the language spec; this is just a hack fix to make more sites load
at all.
2020-05-26 22:31:22 +02:00
Linus Groh
72c52466e0 LibWeb: Add more HTML entities
®, ß and all the lowercase and uppercase umlaut characters.
2020-05-26 22:23:09 +02:00
Andreas Kling
f01af62313 LibWeb: Basic support for display:inline-block with width:auto
We now implement the somewhat fuzzy shrink-to-fit algorithm when laying
out inline-block elements with both block and inline children.

Shrink-to-fit works by doing two speculative layouts of the entire
subtree inside the current block, to compute two things:

1. Preferred minimum width: If we made a line break at every chance we
   had, how wide would the widest line be?
2. Preferred width: If we broke only when explicitly told to (e.g. "<br>"),
   how wide would the widest line be?

We then shrink the width of the inline-block element to an appropriate
value based on the above, taking the available width in the containing
block into consideration (sans all the box model fluff.)

To make the speculative layouts possible, plumb a LayoutMode enum
throughout the layout system since it needs to be respected in various
places.

Note that this is quite hackish and I'm sure there are smarter ways to
do a lot of this. But it does kinda work! :^)
2020-05-26 22:02:27 +02:00
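
The commit message does not spell out how the final width is picked from the two measurements. A minimal sketch, assuming the formula from CSS 2.1 section 10.3.5, min(max(preferred minimum width, available width), preferred width); the function name and the use of plain floats are illustrative:

```cpp
#include <algorithm>

// Assumption: the CSS 2.1 shrink-to-fit formula, not necessarily what the
// patch itself computes.
float shrink_to_fit_width(float preferred_minimum_width, float preferred_width, float available_width)
{
    return std::min(std::max(preferred_minimum_width, available_width), preferred_width);
}
```

The result never exceeds the preferred width, and it only drops below the available width when the content does not need that much room.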
FalseHonesty
4e8bcda4d1 LibWeb: Add HTML copyright escape 2020-05-26 22:02:17 +02:00
Kevin Meyer
b85ab86c84 LibWeb: Fix step within reconstruct the active elements
In step 4 of the "reconstruct the active formatting elements" algorithm it
says:
  Rewind: If there are no entries before entry in the list of active
  formatting elements, then jump to the step labeled create.

Prior to this patch, the implementation followed the spec only for
the first loop iteration.
2020-05-26 21:52:46 +02:00
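
A rough sketch of the rewind/advance/create control flow, with the step 4 check evaluated on every pass through the loop. The `Entry` struct and the returned list of created names are hypothetical simplifications; the real algorithm works on DOM elements and the stack of open elements.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simplification of an entry in the list of active formatting elements.
struct Entry {
    std::string name;
    bool is_marker;
    bool is_open; // Stands in for "is in the stack of open elements".
};

// Returns the names of the elements that get (re)created, in order.
std::vector<std::string> reconstruct_active_formatting_elements(std::vector<Entry>& list)
{
    std::vector<std::string> created;
    if (list.empty())
        return created;

    std::size_t index = list.size() - 1;
    if (list[index].is_marker || list[index].is_open)
        return created;

    // Rewind: the "no entries before entry" check happens on every
    // iteration, not only on the first one.
    while (index > 0) {
        --index;
        if (list[index].is_marker || list[index].is_open) {
            ++index; // Advance: move to the entry one later.
            break;
        }
    }

    // Create: re-open every entry from here to the end of the list.
    for (; index < list.size(); ++index) {
        list[index].is_open = true;
        created.push_back(list[index].name);
    }
    return created;
}
```

When no earlier entry is a marker or already open, the loop runs out of entries and falls through to the create step at the first entry, which is the case the rewind check exists for.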
Andreas Kling
4a9deddb4a LibWeb: The line-height should not be multiplied by the glyph height
This was causing very tall lines on many websites. We can now see the
section header thingy on google.com (although it's broken into lines
where it should not be..) :^)
2020-05-26 21:09:32 +02:00
Andreas Kling
7bb69bb9bf LibWeb: Implement immediate execution in HTMLScriptElement preparation
In some cases, Dr. HTML says we should execute the script right away
even if other scripts are running.
2020-05-26 15:55:18 +02:00
Andreas Kling
ecd25ce6c7 LibWeb: Allow HTML tokenizer to emit more than one token
Tokens are now put on a queue when emitted, and we always pop from that
queue when returning from next_token().
2020-05-26 15:50:05 +02:00
Sergey Bugaev
602c3fdb3a AK: Rename FileSystemPath -> LexicalPath
And move canonicalized_path() to a static method on LexicalPath.

This is to make it clear that FileSystemPath/canonicalized_path() only
perform *lexical* canonicalization.
2020-05-26 14:35:10 +02:00
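
To illustrate what "lexical" means here, a minimal sketch that canonicalizes a path using nothing but its text. The function name and exact edge-case behaviour are assumptions and may differ from `LexicalPath::canonicalized_path()`.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Purely textual canonicalization: never touches the file system, so
// symlinks are not resolved and nothing is checked for existence.
std::string lexically_canonicalize(const std::string& path)
{
    bool is_absolute = !path.empty() && path[0] == '/';
    std::vector<std::string> parts;
    std::istringstream stream(path);
    std::string part;
    while (std::getline(stream, part, '/')) {
        if (part.empty() || part == ".")
            continue; // Skip repeated slashes and "." components.
        if (part == "..") {
            if (!parts.empty() && parts.back() != "..")
                parts.pop_back(); // Cancel out the previous component.
            else if (!is_absolute)
                parts.push_back(".."); // Keep leading ".." in relative paths.
            continue;
        }
        parts.push_back(part);
    }

    std::string result = is_absolute ? "/" : "";
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0)
            result += '/';
        result += parts[i];
    }
    return result.empty() ? "." : result;
}
```

If `/a/link` were a symlink to another directory, `/a/link/..` would still come out as `/a` here, which is exactly why the name should say "lexical".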
Andreas Kling
8ff4ebb589 LibWeb: Add Element.getAttribute() and Element.setAttribute() :^) 2020-05-26 12:27:10 +02:00
FalseHonesty
b352a6b59d LibWeb: Implement vendor specific CSS color style for System Palette
Add "-libweb-palette-foo-bar" CSS color properties to allow CSS to
style itself using the currently selected System Theme.
2020-05-26 10:17:50 +02:00
Linus Groh
67b742bf32 LibWeb: Add document.querySelector() 2020-05-26 00:12:20 +02:00
Andreas Kling
1e30ef239b LibWeb: Start fleshing out the "in table" parser insertion mode 2020-05-25 20:30:34 +02:00
Andreas Kling
f62a8d3b19 LibWeb: Handle some more parser inputs in the "in head" insertion mode 2020-05-25 20:16:48 +02:00
Andreas Kling
50265858ab LibWeb: Add a PARSE_ERROR() macro to the new HTML parser
Unless otherwise stated, we shouldn't stop parsing just because there's
a parse error, so let's allow ourselves to continue.

With this change, we can now tokenize and parse the ACID1 test. :^)
2020-05-25 20:02:27 +02:00
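
What such a macro could look like is sketched below. This is a guess at the general shape (record the error, keep going), not the macro from the patch; `g_parse_error_count` and `count_letters()` are hypothetical.

```cpp
#include <cstdio>

// Hypothetical counter so callers can see how many errors were reported.
static int g_parse_error_count = 0;

// Report the error location and carry on instead of aborting.
#define PARSE_ERROR()                                                       \
    do {                                                                    \
        ++g_parse_error_count;                                              \
        std::fprintf(stderr, "Parse error at %s:%d\n", __FILE__, __LINE__); \
    } while (0)

// Toy "parser": counts ASCII letters, reports anything else, keeps parsing.
int count_letters(const char* input)
{
    int letters = 0;
    for (const char* p = input; *p; ++p) {
        if ((*p >= 'a' && *p <= 'z') || (*p >= 'A' && *p <= 'Z')) {
            ++letters;
            continue;
        }
        PARSE_ERROR();
    }
    return letters;
}
```

The `do { ... } while (0)` wrapper lets the macro be used like a statement, including inside an unbraced `if`.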
Andreas Kling
406fd95f32 LibWeb: Flesh out the remaining DOCTYPE related tokenizer states
We can now parse public and system identifiers! Not super useful, but
at least we can do it :^)
2020-05-25 19:51:23 +02:00
Andreas Kling
556a6eea61 LibWeb: Checking for "DOCTYPE" should be case insensitive in tokenizer 2020-05-25 19:51:23 +02:00
Andreas Kling
1df2a3d8ce LibWeb: Use String::is_one_of() a bunch in the HTML parser 2020-05-25 19:51:23 +02:00
Linus Groh
fd7cbb5389 LibWeb: Add navigator.language and navigator.languages
Hardcoded to "en-US" and ["en-US"] respectively.
2020-05-25 15:15:31 +02:00
Andreas Kling
21b1aba03b LibWeb: Add missing copyright header 2020-05-25 00:25:33 +02:00
Andreas Kling
4cbe202d2c LibWeb: Finally parse enough that we can actually handle welcome.html!
We made it, at last! What a long journey this was. :^)
2020-05-24 23:54:22 +02:00
Andreas Kling
65d8d5e83e LibWeb: Yet more work towards parsing www/welcome.html :^) 2020-05-24 23:54:22 +02:00
Andreas Kling
45da08a1e6 LibWeb: A whole bunch of work towards spec-compliant <script> elements
This is still very unfinished, but there's at least a skeleton of code.
2020-05-24 23:54:22 +02:00
Andreas Kling
3a30180e1e LibWeb: Add HTMLScriptElement to the forwarding header 2020-05-24 23:54:22 +02:00
Andreas Kling
128eaf9295 LibWeb: Add some helpers to the DOM Node class
This patch adds the following things needed by the HTML spec:

- Node::child_text_content()
- Node::is_connected()
- Node::root()
2020-05-24 23:54:22 +02:00
Andreas Kling
6c409310a8 LibWeb: Add a way to opt out of TreeNode::append_child() notifications
This will be used temporarily by the new HTML parser while we're
bringing it up.
2020-05-24 23:54:22 +02:00
Andreas Kling
5d332c1f11 LibWeb: Parse enough to handle a <style> inside a <head> :^) 2020-05-24 23:54:22 +02:00
Andreas Kling
af8a9331b2 LibWeb: Support comments in the "in head" insertion mode 2020-05-24 23:54:22 +02:00
Andreas Kling
20911efd4d LibWeb: More work on the HTML parser and tokenizer
The parser can now switch the state of the tokenizer! Very webby. :^)
2020-05-24 23:54:22 +02:00
Andreas Kling
31db3f21ae LibWeb: Start implementing character token parsing
Now that we've gotten rid of the misguided character buffering in the
tokenizer, it actually spits out character tokens that we have to deal
with in the parser.

This patch implements enough to bring us back to speed with simple.html.
2020-05-24 23:54:22 +02:00
Andreas Kling
53d2f4df70 LibWeb: Factor out the "stack of open elements" into its own class
This will allow us to write more expressive parsing code. :^)
2020-05-24 23:54:22 +02:00
Andreas Kling
96cc1138c0 LibWeb: Remove tokenizer's premature character buffering optimization 2020-05-24 23:54:22 +02:00
Daniel Gustafsson
6561987e9f LibWeb: Fix copy-paste error in HTMLDocumentParser (#2358)
When watching the video of the new HTML parser I noticed a small copy
and paste error. In one of the cases in `handle_after_head` the code
was checking for end tags when it should check for start tags.

I haven't tested this change, just looking at the spec.
2020-05-24 13:48:46 +02:00
Jack Byrne
58480a510f LibWeb: Improve support for white-space CSS property (#2348)
Add reasonable support for all values of white-space CSS property.

Values of the property are translated into a 3-tuple of rules:

    do_collapse:        whether whitespace is to be collapsed
    do_wrap_lines:      whether to wrap on word boundaries when
                        lines get too long
    do_wrap_breaks:     whether to wrap on linebreaks

The previously separate handling of per-line splitting and per-word
splitting has been unified. The Word structure is now a more
general Chunk, which represents different amounts of text depending
on splitting rules.
2020-05-24 09:49:02 +02:00
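
A sketch of how the property values could map onto the 3-tuple. The field names come from the commit message; the mapping itself is assumed from the white-space definitions in CSS 2.1, and `WhiteSpaceRules` and `rules_for_white_space()` are hypothetical names.

```cpp
#include <string>

struct WhiteSpaceRules {
    bool do_collapse;    // Collapse runs of whitespace?
    bool do_wrap_lines;  // Wrap on word boundaries when lines get too long?
    bool do_wrap_breaks; // Wrap on linebreaks in the source text?
};

// Assumed mapping, following the CSS 2.1 description of each value.
WhiteSpaceRules rules_for_white_space(const std::string& value)
{
    if (value == "nowrap")
        return { true, false, false };
    if (value == "pre")
        return { false, false, true };
    if (value == "pre-wrap")
        return { false, true, true };
    if (value == "pre-line")
        return { true, true, true };
    return { true, true, false }; // "normal", and the fallback for unknown values.
}
```

Reducing each value to three booleans means the splitting code only has to branch on the rules, not on the property value itself.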
Emanuele Torre
3f2158bbfe LibWeb: HtmlTokenizer.cpp: fix ON_WHITESPACE macro
The "audible bell" character ('\a' U+0007) was treated as whitespace
while the "line feed" character ('\n' U+000a) was not.

'\a' is no longer considered whitespace.
'\n' is now considered whitespace.
2020-05-24 09:47:28 +02:00
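
As a plain function rather than a macro, the corrected check could look like the sketch below. The commit only confirms that '\n' is included and '\a' is not; the remaining characters are assumed from the whitespace list the HTML tokenizer states use (tab, line feed, form feed, space), and the function name is hypothetical.

```cpp
// Assumed whitespace set: U+0009 TAB, U+000A LF, U+000C FF, U+0020 SPACE.
bool is_tokenizer_whitespace(char c)
{
    return c == '\t' || c == '\n' || c == '\f' || c == ' ';
}
```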
FalseHonesty
51e79a2bbc LibWeb: Add hook to HtmlView when a new document is set 2020-05-24 02:20:08 +02:00
Andreas Kling
e44c87cfff LibWeb: Implement enough HTML parsing to handle a small simple DOM :^)
We can now parse a little DOM like this:

<!DOCTYPE html>
<html>
    <head></head>
    <body>
        <div></div>
    </body>
</html>

This is pretty slow work, but the incremental progress is satisfying!
2020-05-24 00:49:22 +02:00
Andreas Kling
fd1b31d0ff LibWeb: Start building the tree building part of the new HTML parser
This patch adds a new HTMLDocumentParser class. It keeps a tokenizer
object internally and feeds itself with one token at a time from it.

The names and idioms in this class are expressed as closely to the
actual HTML parsing spec as possible, to make development as easy
and bug free as possible. :^)

This is going to become pretty large, but it's pretty cool!
2020-05-24 00:14:23 +02:00
Andreas Kling
0b61e21873 LibWeb: Add HTMLFormElement to forwarding header 2020-05-24 00:12:27 +02:00
Andreas Kling
a9fba2fb91 LibWeb: Add "name" to DocumentType nodes 2020-05-24 00:12:00 +02:00