Some of these are less helpful than others. Avoiding a bunch of mallocs in the event loop wakeup code is definitely nice.
I'm still feeling this out, but I am starting to like the general idea.
The algorithm I came up with is O(n^2), but given the small number of rects we're typically working with, it doesn't really matter. We may need to revisit this in the future if we find ourselves with a huge number of rects.