turtledragonfly 3y ago

Thanks for the link — no, I had not heard of it.

Though I must say, looking at it briefly, I see no direct mention of "linear quadtree" (or even just "quadtree"), nor of Morton codes, Z-order curves, BIGMIN/LITMAX functions, etc.

I suppose that's another difficulty with quadtrees — they're so common and amenable to hand-rolling that there are lots of different variations on terminology.

Do you happen to know where the linear-quadtree-related code is, in that library?

5 comments · 2 top-level

yencabulator 3y ago · 3 in thread

S2 uses Hilbert curves, which is slightly different but definitely the same idea. https://s2geometry.io/devguide/s2cell_hierarchy

A cell ID in the S2 coordinate system is equivalent to the path to a quadtree node, as in 0101101 describes which sub-sub-sub-etc-quadrant you're talking about.

It seems like a linear quadtree in this context would essentially be an array sorted by S2 cell IDs? (I'm not entirely clear whether linear quadtrees are just the old trick of array-ifying a fixed-fanout tree structure, where k's children are 2k+1, 2k+2, etc.)
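For reference, a minimal sketch of that array-ifying trick applied to a complete quadtree, assuming fan-out 4 and breadth-first storage, so node k's children sit at 4k+1 through 4k+4 (the 2k+1, 2k+2 form is the binary-tree case). The function names are illustrative only.

```python
FANOUT = 4  # quadtree: four children per node (assumed, complete tree)

def children(k):
    """Indices of node k's children in a breadth-first array layout."""
    return [FANOUT * k + i for i in range(1, FANOUT + 1)]

def parent(k):
    """Index of node k's parent; the root (index 0) has none."""
    if k == 0:
        raise ValueError("root has no parent")
    return (k - 1) // FANOUT
```

This only works when every node has all four children present in the array, which is why it does not carry over to a sparse list of cells.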

turtledragonfly (OP) 3y ago

A linear quadtree is (most commonly, in my experience) just a sorted list of Morton codes. So yes, very similar to sorted S2 cell IDs, it would seem.
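A minimal sketch of that idea, assuming 16-bit non-negative integer coordinates and the common convention of x bits in even positions and y bits in odd positions. The `interleave` name and the sample points are made up for illustration.

```python
def interleave(x, y, bits=16):
    """Morton code: x bits go to even positions, y bits to odd positions."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

# Hypothetical occupied cells; the "linear quadtree" is just their sorted codes.
points = [(3, 5), (0, 0), (2, 2), (7, 1)]
linear_quadtree = sorted(interleave(x, y) for x, y in points)
```

Only the occupied cells appear in the list, so the structure is as sparse as the data.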

One way to efficiently do a spatial query on such a sorted list is the BIGMIN/LITMAX algorithm I mentioned earlier. I believe the originating paper is: http://hermanntropf.de/media/multidimensionalrangequery.pdf (see section 4 — "Range Search").
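BIGMIN/LITMAX itself is more involved than fits in a comment. As a simpler baseline (explicitly not the algorithm from the paper), the sketch below relies on the fact that every point inside a rectangle has a Morton code between the codes of the rectangle's min and max corners: binary-search to that interval, then filter. The helper names and the 16-bit coordinate width are assumptions.

```python
from bisect import bisect_left, bisect_right

def interleave(x, y, bits=16):
    """Morton code: x bits in even positions, y bits in odd positions."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

def deinterleave(code, bits=16):
    """Inverse of interleave: recover (x, y) from a Morton code."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y

def range_query(codes, xmin, ymin, xmax, ymax):
    """Return points in the inclusive rectangle from a sorted code list."""
    lo = bisect_left(codes, interleave(xmin, ymin))
    hi = bisect_right(codes, interleave(xmax, ymax))
    hits = []
    for code in codes[lo:hi]:
        # The code interval also covers cells outside the rectangle,
        # so each candidate still needs an explicit coordinate check.
        x, y = deinterleave(code)
        if xmin <= x <= xmax and ymin <= y <= ymax:
            hits.append((x, y))
    return hits
```

The filtering loop is the part BIGMIN/LITMAX improves on: instead of visiting every code in the interval, it jumps over the runs that lie outside the query rectangle.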

In my particular case, I also only cared about the "black" nodes in a tree (that is, ones with some data inside), so the linear quadtree is also sparse, speeding it up more. I think since S2 is modeling the entire sphere's surface, it is fully dense. A sparse version might be "only include the cells that contain land," if that were all that was needed for the application.

Though in the sparse version, you don't have fixed-fanout (or at least, it's not embodied in the array), so you can't use the trick you mention. Pros and cons (:

One key advantage of Morton codes is that, given a cell ID, you can split it into the "x bits" and "y bits", to get a genuine (x,y) coordinate in the mapped space. Then you can, for instance, easily navigate to your (spatial) neighbor by incrementing X or Y. If I understand your last sentence correctly, this is not something you can do so easily with a plain array-ifying trick.
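A minimal sketch of that split-and-step move, assuming 16-bit coordinates with x bits in even positions and y bits in odd positions. `east_neighbor` is an illustrative name, not something from a library.

```python
def interleave(x, y, bits=16):
    """Morton code: x bits in even positions, y bits in odd positions."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

def deinterleave(code, bits=16):
    """Split a Morton code back into its x bits and y bits."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y

def east_neighbor(code, bits=16):
    """Code of the cell one step in +x; raises at the edge of the grid."""
    x, y = deinterleave(code, bits)
    if x + 1 >= (1 << bits):
        raise ValueError("no neighbor: already at the +x edge")
    return interleave(x + 1, y, bits)
```

The neighbor's code can land far away in the sorted list, so finding it there is still a binary search, but computing which code to look for is cheap.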

yencabulator 3y ago

> I think since S2 is modeling the entire sphere's surface, it is fully dense. A sparse version might be "only include the cells that contain land," if that were all that was needed for the application.

I mean, S2 can express any point on the surface, but you don't have to populate any such data structure for all the points, and normally wouldn't. Generally S2 containers are sparse and don't use any custom geo-container code.

Background: S2 cell IDs are compact 64-bit values, but they still encode the size of the cell (roughly, "more trailing zeroes mean we're talking about larger and larger cells"). That is, you can cover "continental US" with just a few cell IDs.
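A minimal sketch of the trailing-zeroes idea in plain Python (not the S2 library API), assuming the layout described in the S2 developer guide: 3 face bits, then 2 bits per level for up to 30 levels, then a single 1 bit followed by zeroes. The function names are illustrative.

```python
MAX_LEVEL = 30  # leaf cells are at level 30 in this layout

def lsb(cell_id):
    """Lowest set bit: the marker that terminates the quadtree path."""
    return cell_id & -cell_id

def level(cell_id):
    """Each pair of trailing zeroes means one level coarser than a leaf."""
    trailing_zeros = lsb(cell_id).bit_length() - 1
    return MAX_LEVEL - trailing_zeros // 2

def parent(cell_id, parent_level):
    """Ancestor at parent_level: drop the finer path bits, move the marker up."""
    new_lsb = 1 << (2 * (MAX_LEVEL - parent_level))
    return (cell_id & -new_lsb) | new_lsb
```

For example, a leaf on face 5 is `(5 << 61) | 1`, and `parent(leaf, 0)` gives the whole face cell, whose marker bit sits at position 60.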

(I'm not a Googler but) The way S2 is normally used, cell IDs are just keys in an ordered key-value store, and if you want to ask for example "what country contains point P", you store a cellId->country mapping with as large cells as possible. Then you'd look up the key closest to P in the store. Cell ID 101101... is contained inside cell ID 10110... is contained inside 1011... etc, so all you need to do is find the node with the max cell ID <= the point you are looking for.

B+trees etc. are all great for this. For an in-memory array, a binary search would work. No need for fancy custom data structures, just general key-value storage where you can find the key near input P (that is, ordered, not a hashmap).
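A minimal sketch of the in-memory version, assuming the stored cells do not overlap and use the trailing-marker-bit layout, where a cell's ID sits in the middle of the range of leaf IDs it covers. Because of that midpoint placement, this sketch checks the key on each side of the insertion point rather than only the largest key <= P. The names are illustrative, not library calls.

```python
from bisect import bisect_left

def lsb(cell_id):
    """Lowest set bit of the ID (the size marker)."""
    return cell_id & -cell_id

def range_min(cell_id):
    """Smallest leaf ID covered by this cell."""
    return cell_id - (lsb(cell_id) - 1)

def range_max(cell_id):
    """Largest leaf ID covered by this cell."""
    return cell_id + (lsb(cell_id) - 1)

def lookup(keys, values, leaf):
    """keys: sorted, non-overlapping cell IDs; values: parallel payloads."""
    i = bisect_left(keys, leaf)
    # The containing cell, if any, is either the first key >= leaf ...
    if i < len(keys) and range_min(keys[i]) <= leaf:
        return values[i]
    # ... or the last key < leaf.
    if i > 0 and range_max(keys[i - 1]) >= leaf:
        return values[i - 1]
    return None
```

With toy IDs `keys = [16, 36]` and `values = ["A", "B"]`, cell 16 covers leaves 1 through 31 and cell 36 covers leaves 33 through 39. The same two-neighbor check works against any ordered store that can seek to the first key >= P and step back one entry.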

gniv 3y ago

As yencabulator said, you can do point queries quite easily when indexing using S2 cells. But you can also do range queries efficiently; you just have to be a bit more careful with creating the query cells.

gniv 3y ago

It's the S2 cell ids (S2CellId in the library). See also this slide deck: https://docs.google.com/presentation/d/1Hl4KapfAENAOf4gv-pSn...
