You have to be at pretty massive scale before federation becomes necessary, and by then (if ever) your frontend teams have experienced benefits that are pretty much miraculous. The reason frontend wants to fit as much as possible into it is that it's vastly better than what came before, unless you have a 95th-percentile org that is doing an outstanding job managing the API via other means.
Speaking of Netflix - I think they had an alternative API federation service that used some clever tricks with JSON string vs. number keys to allow for alternating HTTP PUT/GET - and through that leveraged HTTP-level caching. But I can't find the link...
Fair point about DB scaling, but I'm not sure everyone is going to run into this issue. Also, lots of solutions are emerging for this specific problem (with different trade-offs, of course), like distributed databases (Crunchy, YugaByte, Spanner, etc.). Most folks I work with get by with a reasonably sized DB and some read replicas.
Not a GraphQL problem though IMO.
You are correct that not everyone is going to end up with this issue, but those who are thinking of growing into a medium-sized business should be working to avoid it, which unfortunately means solutions such as Hasura lose value. It would be good to see more ability to collect data from multiple sources (please reply and correct me if you already do this, I'm not super familiar with the service).
What about the concept doesn't work? It's just a syntax for queries, I'm confused why it wouldn't scale.
The problem there is who has the right to run which queries?
So the real problem is authorization.
As for GraphQL - it's great for clients. As a backend engineer, you still have to do the work and a LOT of it.
This is just like microservices. No due diligence on whether the added complexity and destroyed productivity is worth it. "Everyone else is doing it".
Maybe you’re too out of touch. This isn’t true, at least not anymore. GraphQL is losing appeal.
If we are building a report or dashboard that we pull up a few times a day then a pull based model where we query the database on page load is fine.
For almost anything else such as an app, a microservice, an alerting system, a web page, a dashboard, we want to be able to update it in near real time for the user experience. Receiving a stream of query results is by far the easiest way to do this.
Polling is obviously a poor interim solution.
I think streaming will be a huge story in data over the next decade. The products are coming through now which is a start.
https://github.com/MaterializeInc/materialize/blob/main/LICE...
I'm generally pretty happy to pay for open source software, but licensing like this is just too risky. I need to be able to experiment with something at scale, in production, before I start paying someone.
A lot of work has taken place in the Kafka, Flink, DataFlow ecosystem but that still leaves a lot of work for the developer over a simple subscribe to query results.
I do think a lot of work has been done, but it all needs to move up a few levels of abstraction.
Personally I find it much easier to write the code explicitly than try to understand what a query planner is doing, especially if performance is relevant. (That's not to say there's not plenty of room for improvement in the streaming world - but I'd rather have a helper library that I can use on top of the low-level API, than have to go through a parser and planner for every query even when I know exactly what I want to do) But I seem to be an outlier in this regard.
Our focus (Hasura) is on the last mile so that innovations on the data side (eg: materialize, ksql, timescale continuous aggregates) are “just obvious” to start building applications and services against.
Plus you can stream db changes from postgres to kafka for those edge cases where you really need it
TLDR: if you're in a startup and thinking of building a distributed system... DON'T. Stick with a monolith and spin out services as NEEDED.
Curious what products you’ve seen as well.
So the question is: Why have two copies of your data, two products to learn and monitor and operate, write boilerplate to move data between the DBs, etc.?
A "message queue" comes down to being another index on your table/set of tables, ordered by a post-commit sequence number. These are things all SQL DBs have already; they just lack a bit of exposing/packaging to be as convenient to use as a message queue.
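To make the "queue as an ordered index" idea concrete, here is a minimal sketch using Python's built-in `sqlite3`. The `events` table and the `publish`/`consume` helpers are made up for illustration, and `AUTOINCREMENT` is only a stand-in for a true post-commit sequence number: in a real multi-writer database, IDs are assigned at insert time and can become visible out of order, which is exactly the packaging gap described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " seq INTEGER PRIMARY KEY AUTOINCREMENT,"  # assumption: stands in for a post-commit sequence number
    " payload TEXT NOT NULL)"
)

def publish(payload: str) -> None:
    """Append one message; the primary-key index keeps rows ordered by seq."""
    with conn:  # commits on success
        conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))

def consume(after_seq: int, limit: int = 100) -> list[tuple[int, str]]:
    """Return messages with seq > after_seq, oldest first."""
    return conn.execute(
        "SELECT seq, payload FROM events WHERE seq > ? ORDER BY seq LIMIT ?",
        (after_seq, limit),
    ).fetchall()

publish("a")
publish("b")
publish("c")

cursor = 0                  # the consumer's position in the "queue"
batch = consume(cursor)     # everything published so far
cursor = batch[-1][0]       # advance past the last message seen

publish("d")
next_batch = consume(cursor)  # only the message published after the cursor
```

Each consumer only has to remember its own `cursor`, so the same table can serve any number of independent readers.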
[1] https://hasura.io/docs/latest/auth/authorization/basics/
[1] https://postgrest.org/en/stable/auth.html#roles-for-each-web... [2] https://www.graphile.org/postgraphile/security/
Seeing some feedback on GraphQL - Hasura has had support for converting templated GraphQL into RESTish endpoints (with OpenAPI Spec docs if needed). We are planning to do the same for this streaming API as well - does anyone have good examples of existing REST/RESTish endpoints that do something similar?
There are also a few NPM packages for auto-generating that allow list from your project (https://www.npmjs.com/search?q=hasura%20allow%20list -- the one I've used before was from `tallerdevs`).
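For anyone unfamiliar with the idea, an allow list boils down to "only run queries the project already knows about". Below is a rough sketch of the concept in Python. The `fingerprint` and `is_allowed` helpers and the whitespace normalization are my own assumptions for illustration, not how Hasura actually matches queries.

```python
import hashlib

def fingerprint(query: str) -> str:
    """Hash a query after collapsing whitespace (a deliberately naive normalization)."""
    normalized = " ".join(query.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Hypothetical list of queries collected from the project's source code.
approved_queries = [
    "query Messages { messages { id text } }",
]
ALLOWED = {fingerprint(q) for q in approved_queries}

def is_allowed(query: str) -> bool:
    """Accept a query only if its fingerprint is in the allow list."""
    return fingerprint(query) in ALLOWED

same_query_reformatted = is_allowed("query Messages {\n  messages { id text }\n}")
unknown_query = is_allowed("query Users { users { password } }")
```

Here `same_query_reformatted` is accepted because only whitespace differs, while `unknown_query` is rejected before it ever reaches the database.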
They're all similar flavors of producing realtime results - which take similar, but different, methods to their approach.
My understanding (please feel free to correct me if I'm wrong):
- Supabase Realtime uses WAL.
- Hasura Streaming Subscriptions uses a query which will be append-only (could be a sort-by or also WAL).
- Hasura Live Queries uses interval polling, refetching, and multiplexing.
- Supabase uses Postgres RLS for authorization, while Hasura uses an internal RLS system which composes queries (which allows for features like the multiplexing above).
- All 3 use websockets for their client communication.
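To illustrate the "interval polling, refetching, and multiplexing" bullet, here is a simplified sketch of the general idea, not Hasura's actual implementation. Subscribers that run the same query with different variables are served by one batched SQL query per poll, and a result is only pushed when it changed since the last poll. The `orders` table, the `subscribers` map, and `poll_once` are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(1, 10.0), (2, 5.0), (1, 2.5)],
)

# Hypothetical subscribers: same query shape, different variable (user_id).
subscribers = {"sub-a": 1, "sub-b": 2, "sub-c": 3}
last_sent: dict[str, list] = {}  # last result pushed to each subscriber

def poll_once() -> dict[str, list]:
    """Run ONE batched query for all subscribers; return only the changed results."""
    user_ids = sorted(set(subscribers.values()))
    placeholders = ",".join("?" for _ in user_ids)
    rows = conn.execute(
        f"SELECT user_id, id, total FROM orders "
        f"WHERE user_id IN ({placeholders}) ORDER BY id",
        user_ids,
    ).fetchall()

    # Split the combined result back out per variable value.
    by_user: dict[int, list] = {uid: [] for uid in user_ids}
    for user_id, order_id, total in rows:
        by_user[user_id].append((order_id, total))

    changed = {}
    for sub_id, uid in subscribers.items():
        result = by_user[uid]
        if last_sent.get(sub_id) != result:  # refetch-and-diff
            last_sent[sub_id] = result
            changed[sub_id] = result
    return changed

first = poll_once()    # every subscriber gets an initial result
second = poll_once()   # nothing changed, so nothing is pushed
conn.execute("INSERT INTO orders (user_id, total) VALUES (?, ?)", (2, 7.0))
third = poll_once()    # only the subscriber watching user 2 is notified
```

In a real server `poll_once` would run on a timer and the `changed` entries would be written to each subscriber's websocket.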
Supabase Realtime
https://github.com/supabase/realtime#introduction
https://supabase.com/docs/guides/realtime
Hasura Streaming Subscriptions / Live Queries
https://github.com/hasura/graphql-engine/blob/master/archite...
https://github.com/hasura/graphql-engine/blob/master/archite...
We have 2 interfaces for real-time subscriptions: live queries and streaming. The former has been around for a few years now. You can read more here: https://hasura.io/docs/latest/subscriptions/postgres/index/#...
With a streaming subscription this is all taken care of. You simply do something like this:
1) Load the last 50 messages (or whatever number of existing messages suits you).
2) Run your streaming subscription, using the most recent of the messages from above as your cursor.
3) When you send a message, do nothing and let the subscription handle getting the message you just sent.
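The three steps above can be sketched in plain Python. This is only a simulation of the flow: the in-memory `messages` list stands in for the table, and `load_recent` / `stream_after` are hypothetical helpers rather than Hasura's client API, which would deliver the batches over a websocket.

```python
from typing import Iterator

# Hypothetical "messages" table: (id, text), with monotonically increasing ids.
messages: list[tuple[int, str]] = [(i, f"msg {i}") for i in range(1, 121)]

def load_recent(n: int) -> list[tuple[int, str]]:
    """Step 1: fetch the last n existing messages."""
    return messages[-n:]

def stream_after(cursor: int, batch_size: int) -> Iterator[list[tuple[int, str]]]:
    """Step 2: yield batches of rows with id > cursor, advancing the cursor."""
    while True:
        batch = [m for m in messages if m[0] > cursor][:batch_size]
        if not batch:
            return  # a real subscription would wait for new rows instead of stopping
        cursor = batch[-1][0]
        yield batch

history = load_recent(50)
cursor = history[-1][0]  # the most recent message id becomes the starting cursor

# Step 3: sending just inserts into the table; the client does nothing else.
messages.append((121, "sent by me"))
messages.append((122, "sent by someone else"))

received = [row for batch in stream_after(cursor, batch_size=10) for row in batch]
```

Because the stream starts strictly after the cursor, the client never sees a duplicate of the history it already loaded, and its own message arrives through the same path as everyone else's.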
Also, there is another interface for real-time subscriptions called live queries which might be more appropriate depending on the use-case: https://hasura.io/docs/latest/subscriptions/postgres/index/#... .
With the right indexes and lateral joins/cross applies you can make the aggregate you need when you query instead of when you write.
So as new data is added to the table, Hasura automatically pushes it to the client via the WebSocket subscription.
No refetch needed.
This works with read-replica-style scaling or with Postgres flavours that support scaling out easily (e.g. Cosmos Postgres; Yugabyte and CockroachDB coming soon).
[1] https://hasura.io/blog/building-real-time-chat-apps-with-gra... [2] https://eclectic-dragon-25a38c.netlify.app/
Would love to see more use cases coming out of this :)
We used https://github.com/hasura/graphql-bench and a set of scripts to monitor runtime characteristics of Hasura and Postgres, and reconciliation to make sure data was received as expected and in-order.
But would love to see if there are other tools that folks have come across!