Show HN: TG – Fast geometry library for C (opens in new tab)

(github.com)

201 pointstidwall2y ago32 comments

32 comments

31 comments · 11 top-level

Waterluvian2y ago· 8 in thread

Very nice! I wonder if I can compile this to wasm. I’ve been badly needing spatial predicates in a web environment that are faster and less clunky than JSTS.

Does it assume a flat Cartesian world or does it handle ellipsoids or even map projections? (Or does it avoid the complexity altogether by not doing any work that cares about distances?)

crabmusket2y ago

Are there any other libraries in the browser space that implement polygon Boolean operations other than JSTS?

I've tried martinez-polygon-clipping, polygon-clipping, and polyclip-ts, all of which are based on the same algorithm, and all of which seem to throw exceptions frequently on geometry with overlapping edges.

I'm thinking I'll need to dig into JSTS in the hope it'll prove more stable... or learn how to write these kinds of algorithms myself.

kylebarron2y ago

I'm working on bringing GeoRust algorithms [0] (which include boolean operations) to the web via WebAssembly. See this blog post [1] for an intro. There's also separate work on binding GEOS to Wasm [2], but I'm more excited about GeoArrow in the long term because you can interpret Wasm geometries in JS without any copies across the boundary [3].

[0]: https://docs.rs/geo/latest/geo/#algorithms

[1]: https://observablehq.com/@kylebarron/prototyping-georust-geo...

[2]: https://github.com/chrispahm/geos-wasm

[3]: https://observablehq.com/@kylebarron/zero-copy-apache-arrow-...

crabmusket2y ago

Oh I've read [3] before, that's very exciting! I'll be following your RSS feed :)

Tanjreeve2y ago

I would assume it's flat

A) if you're handling projections or doing geodesic geometry then there's a bunch of other stuff to do and you'd probably have to mention it.

B) it would be obvious in the codebase if that was a part of it there'd be mentions of projections or some sort of constants for spherical maths.

C) Projections etc are often very application specific afaik it's not really possible to generalise it safely.

o11c2y ago

Ellipsoids don't have a well-defined answer to what is "inside" or "outside" the asylum.

simonw2y ago

Yeah a WASM build of this would be fantastic. It looks like it's a very clean C codebase so I expect getting it working in WASM would be relatively straightforward.

tidwallOP2y ago

Yes, I used wasm as one of the targets during development.

simonw2y ago

Awesome, you should add that to the readme!

I blogged about it here: https://simonwillison.net/2023/Sep/23/tg-polygon-indexing/

aquarin2y ago· 5 in thread

I am curious, what exactly "feq" is calculating?

> return !((x < y) | (x > y));

Is it supposed to avoid x == y. If yes, why?

aquarin2y ago

Also, "length" may have better be implemented using: https://en.cppreference.com/w/c/numeric/math/hypot

It will be helpful, if you add comments discussing why certain functions are implemented in that way, and what algorithms (references) is used. For example, there are some magic numbers without explanation: if (cap < 1000) {

Have you looked at: https://github.com/davideberly/GeometricTools Each algorithm is well documented and explained. There is even a short pdf papers for some of the algorithms explainimg why some decisions are taken and what numerical issues may be expected.

WithinReason2y ago

It implements x==y, but it's also true when one of them is a NaN

aquarin2y ago

I understand. You want to work with NaN-s for some reason.

aquarin2y ago

Is the "Natural structure" described in some paper?

1 more reply

MobiusHorizons2y ago

I think this might handle NaN differently than == does, I believe the comparisons would always be false with NaN. I’m not sure if that is the intent though.

s2mcallis2y ago· 2 in thread

Howdy, I work on S2 [1] so I have questions! How do you deal with polygons that cross the antimeridian?

The indexing structure you've come up with seems very interesting. In spherical coordinates line sweep algorithms like that are a little less intuitive because there's not really a min and max y value to work with. Does your index support multiple polygons indexed together?

The lack of exact predicates worries me a little bit. It's tricky because it will work right until it doesn't for mysterious reasons, and it's very hard to test for if you haven't built it on a foundation of exact predicates. You'll periodically fall into the fractal foam around edges when testing if you cross them or not ([2] has some good pictures). We do this in S2 by relying on a set of predicates [3] that fall all the way back to symbolic perturbation if they have to. We simply don't have to think about colinear points in S2 for that reason.

[1] https://s2geometry.io/ [2] https://github.com/mourner/robust-predicates [3] https://github.com/google/s2geometry/blob/master/src/s2/s2pr...

tidwallOP2y ago

TG isn't spherical and wasn't designed to be. It's 2D and projection agnostic. Crossing the antimeridian is a user problem to solve. I recommend following the GeoJSON rule of antimeridian cutting [1].

As I said in the README. My goals are fast point-in-polygon, geometry intersections, and low memory footprint. Those are my bread-and-butter.

From what I know about S2, it has different goals. Such as being spherical and using discrete cells for distributed spatial indexes. Those are great things of course, but not really what I need.

This is the second comment that recommends using Shewchuk’s methods. And while yes I agree that that may be a superior way to go, and always on the table for the future, I'm still able to achieve my goals without those methods. At least for now.

[1] https://datatracker.ietf.org/doc/html/rfc7946#section-3.1.9

s2mcallis2y ago

Yeah we're spherical mostly because working in spherical coordinates let's you not have to worry about the projection. S2Cells are really just nodes in a quadtree, the difference being our quad tree is implicit, you can (and we do) store them in a linear array in memory.

ginko2y ago· 2 in thread

Rather than creating new geometric primitives on the heap, wouldn't it be more flexible if the caller can provide the memory to initialize the structure in?

Then instead of

  struct tg_geom *geom = tg_geom_new_point(-112, 33);
  if (!geom) {
      // System is out of memory.
  }

you could do

  struct tg_geom geom;
  tg_geom_init_point(&geom, -112, 33);

without need for error checking or cleanup. You can still allocate on the heap if you want but you wouldn't have to.

tidwallOP2y ago

I have strict safety rules for this project, one of which requires that all geometries are thread-safe pointers. So in the example you show, it would not be safe to share the tg_geom pointer with other threads because it belongs on the stack instead of the heap.

Yet I do see the need for avoiding allocations when working with lots of simple geometries like points and rects, and for my use I made functions like tg_geom_intersects_xy and tg_geom_intersects_rect to perform relation operations directly on point and rectangle values.

matthews22y ago

I think that this would be incompatible with the fast cloning/reference counting feature.

pasabagi2y ago· 2 in thread

Hey, this looks really great, but I couldn't really see anything in the docs about robustness (how you handle floating point inaccuracy, etc) - what approach did you use?

tidwallOP2y ago

I can't stop precision loss in all cases, but I do my darnedest to avoid loss when it causes false positives, especially for stuff like intersect detection code. For example the collinear [1] function looks really big for a seemingly simple operation, but there are extra checks built in for precision loss and in the cases of compiler associate math issues (like a user borking a build with -ffast-math).

I'm sure it's not all perfect but I feel pretty good about it overall. It certainly helps that much of the logic derived from the Tile38 [2] project, which has 8 years of use in production. I ported many of the tests too, which make me warm and fuzzy every time they pass.

[1] https://github.com/tidwall/tg/blob/v0.1.0/tg.c#L389

[2] https://github.com/tidwall/tile38

fastneutron2y ago

You might consider Shewchuk’s methods for robust geometric computations: https://www.cs.cmu.edu/~quake/robust.html

There are more modern implementations available on GitHub, but I’ve found it to be fairly easy to integrate into C-family projects without much modification.

simonw2y ago· 1 in thread

This looks really interesting. It implements pretty much the exact subset of geospatial stuff that I care about, in a single C file (an amalgamation that includes its dependencies).

This could make for a really neat SQLite extension, as a much lighter alternative to SpatiaLite.

simonw2y ago

... and here's Alex Garcia's first version of that SQLite extension: https://github.com/asg017/sqlite-tg

hgs32y ago

Nice library. +1 for the excellent visualizations [1]. Will you keep it focused on intersections or might you dabble with triangulation?

[1] https://github.com/tidwall/tg/blob/main/docs/POLYGON_INDEXIN...

simonw2y ago

This is a really great README - clearly explained the benefits of the library and anticipated all of my questions about it.

unnah2y ago

It looks like the coordinates are 64-bit doubles. How accurate and repeatable are the implementations of geometry operations here?

MoSarwatSpatial2y ago

This is awesome! I wonder how feasible is it to include TG in Apache Sedona (https://github.com/apache/sedona)

Although Sedona runs as a distributed system, but TG may speed local in-memory geometrical computation for each worker node. Let me know your thoughts!

sylware2y ago

The author should consider some assembly versions of parts of it (ofc, without abusing the assembler pre-processor).

j / k navigate · click thread line to collapse

32 comments

31 comments · 11 top-level

Waterluvian2y ago· 8 in thread

Very nice! I wonder if I can compile this to wasm. I’ve been badly needing spatial predicates in a web environment that are faster and less clunky than JSTS.

Does it assume a flat Cartesian world or does it handle ellipsoids or even map projections? (Or does it avoid the complexity altogether by not doing any work that cares about distances?)

crabmusket2y ago

Are there any other libraries in the browser space that implement polygon Boolean operations other than JSTS?

I'm thinking I'll need to dig into JSTS in the hope it'll prove more stable... or learn how to write these kinds of algorithms myself.

kylebarron2y ago

[0]: https://docs.rs/geo/latest/geo/#algorithms

[1]: https://observablehq.com/@kylebarron/prototyping-georust-geo...

[2]: https://github.com/chrispahm/geos-wasm

[3]: https://observablehq.com/@kylebarron/zero-copy-apache-arrow-...

crabmusket2y ago

Oh I've read [3] before, that's very exciting! I'll be following your RSS feed :)

Tanjreeve2y ago

I would assume it's flat

A) if you're handling projections or doing geodesic geometry then there's a bunch of other stuff to do and you'd probably have to mention it.

B) it would be obvious in the codebase if that was a part of it there'd be mentions of projections or some sort of constants for spherical maths.

C) Projections etc are often very application specific afaik it's not really possible to generalise it safely.

o11c2y ago

Ellipsoids don't have a well-defined answer to what is "inside" or "outside" the asylum.

simonw2y ago

Yeah a WASM build of this would be fantastic. It looks like it's a very clean C codebase so I expect getting it working in WASM would be relatively straightforward.

tidwallOP2y ago

Yes, I used wasm as one of the targets during development.

simonw2y ago

Awesome, you should add that to the readme!

I blogged about it here: https://simonwillison.net/2023/Sep/23/tg-polygon-indexing/

aquarin2y ago· 5 in thread

I am curious, what exactly "feq" is calculating?

> return !((x < y) | (x > y));

Is it supposed to avoid x == y. If yes, why?

aquarin2y ago

Also, "length" may have better be implemented using: https://en.cppreference.com/w/c/numeric/math/hypot

WithinReason2y ago

It implements x==y, but it's also true when one of them is a NaN

aquarin2y ago

I understand. You want to work with NaN-s for some reason.

aquarin2y ago

Is the "Natural structure" described in some paper?

1 more reply

MobiusHorizons2y ago

I think this might handle NaN differently than == does, I believe the comparisons would always be false with NaN. I’m not sure if that is the intent though.

s2mcallis2y ago· 2 in thread

Howdy, I work on S2 [1] so I have questions! How do you deal with polygons that cross the antimeridian?

[1] https://s2geometry.io/ [2] https://github.com/mourner/robust-predicates [3] https://github.com/google/s2geometry/blob/master/src/s2/s2pr...

tidwallOP2y ago

As I said in the README. My goals are fast point-in-polygon, geometry intersections, and low memory footprint. Those are my bread-and-butter.

From what I know about S2, it has different goals. Such as being spherical and using discrete cells for distributed spatial indexes. Those are great things of course, but not really what I need.

[1] https://datatracker.ietf.org/doc/html/rfc7946#section-3.1.9

s2mcallis2y ago

ginko2y ago· 2 in thread

Rather than creating new geometric primitives on the heap, wouldn't it be more flexible if the caller can provide the memory to initialize the structure in?

Then instead of

  struct tg_geom *geom = tg_geom_new_point(-112, 33);
  if (!geom) {
      // System is out of memory.
  }

you could do

  struct tg_geom geom;
  tg_geom_init_point(&geom, -112, 33);

without need for error checking or cleanup. You can still allocate on the heap if you want but you wouldn't have to.

tidwallOP2y ago

matthews22y ago

I think that this would be incompatible with the fast cloning/reference counting feature.

pasabagi2y ago· 2 in thread

Hey, this looks really great, but I couldn't really see anything in the docs about robustness (how you handle floating point inaccuracy, etc) - what approach did you use?

tidwallOP2y ago

[1] https://github.com/tidwall/tg/blob/v0.1.0/tg.c#L389

[2] https://github.com/tidwall/tile38

fastneutron2y ago

You might consider Shewchuk’s methods for robust geometric computations: https://www.cs.cmu.edu/~quake/robust.html

There are more modern implementations available on GitHub, but I’ve found it to be fairly easy to integrate into C-family projects without much modification.

simonw2y ago· 1 in thread

This looks really interesting. It implements pretty much the exact subset of geospatial stuff that I care about, in a single C file (an amalgamation that includes its dependencies).

This could make for a really neat SQLite extension, as a much lighter alternative to SpatiaLite.

simonw2y ago

... and here's Alex Garcia's first version of that SQLite extension: https://github.com/asg017/sqlite-tg

hgs32y ago

Nice library. +1 for the excellent visualizations [1]. Will you keep it focused on intersections or might you dabble with triangulation?

[1] https://github.com/tidwall/tg/blob/main/docs/POLYGON_INDEXIN...

simonw2y ago

This is a really great README - clearly explained the benefits of the library and anticipated all of my questions about it.

unnah2y ago

It looks like the coordinates are 64-bit doubles. How accurate and repeatable are the implementations of geometry operations here?

MoSarwatSpatial2y ago

This is awesome! I wonder how feasible is it to include TG in Apache Sedona (https://github.com/apache/sedona)

Although Sedona runs as a distributed system, but TG may speed local in-memory geometrical computation for each worker node. Let me know your thoughts!

sylware2y ago

The author should consider some assembly versions of parts of it (ofc, without abusing the assembler pre-processor).

j / k navigate · click thread line to collapse