For example the words "can't" and "32.3" should not have boundaries
detected on the "'" and "." code points, respectively.
The String test cases fixed here are because "b'ar" is now considered
one word.
Similar to commit 6d710eeb43. Rather than
pick-and-chosing what to support, let's just support all encodings now,
as it is trivial. For example, LibGUI will want the UTF-32 overloads.