It does the things you'd expect from this kind of gateway... semantic routing via a three-layer cascade (keyword heuristics, embedding similarity, LLM classifier) that picks the best model when clients omit the model field, weighted round-robin load balancing across local inference servers with health checks and failover.
The part I think is most interesting is the network layer. The gateway and backends communicate over zrok/OpenZiti overlay networks... reach a GPU box behind NAT, expose the gateway to clients, put components anywhere with internet connectivity behind firewalls... no port forwarding, no VPN. Zero-trust in both directions. Most LLM proxies solve the API translation problem. This one also solves the network problem.
Apache 2.0. https://github.com/openziti/llm-gateway
I work for NetFoundry, which sponsors the OpenZiti project this is built on.