A naive GraphQL implementation makes it trivial to fetch giant swaths of your database. That's fine with a 100% trusted client, but if you're using this for a public API or web clients, you can easily be DOSed. Even accidentally!
Shopify's API is a pretty good example of the lengths you have to go to in order to harden a GraphQL API. It's ugly:
https://shopify.dev/concepts/about-apis/rate-limits
You have to limit not just the number of calls, but the quantity of data fetched. And pagination is gross, with `edges` and `node`. This is straight from their examples:
{
  shop {
    id
    name
  }
  products(first: 3) {
    edges {
      node {
        handle
      }
    }
  }
}
Once you fetch a few layers of edges and nodes, queries become practically unreadable. The more rigid fetching behavior of REST and gRPC provides more predictable performance and security behavior.
Of course, if the argument is simply that it tends to be more challenging to manage performance of GraphQL APIs simply because GraphQL APIs tend to offer a lot more functionality than REST APIs, then of course I agree, but that's not a particularly useful observation. Indeed having no API at all would further reduce the challenge!
On their own, such arguments are indeed not useful. But if you can further point out that GraphQL has more functionality than is required, then you can basically make a YAGNI-style argument against GraphQL.
e.g. to get all the comments in every article written by one author, I might call `/author/john smith` to get all their articles, then run `/articles/{}?include=comments` for each one. That runs a separate query server-side for each article, which can get very heavy if I'm doing thousands of requests. On the GraphQL side this is trivial as `{ author(name: "john smith") { articles { comments `, and because it's one request the server-side fetch can be run _way_ more efficiently. We have dataloaders written for the SQL that'll collapse every big query like this into (often) an `IN (?, ?, ...)` query, or sometimes subselects. The same concept works with any SQL or NoSQL approach. So yeah, it might be "a lot" of data were it RESTful, but we're not going to bottleneck on a single indexed query and a ~10MB payload.
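The dataloader idea mentioned above can be sketched minimally: instead of one SQL query per article, pending article IDs are queued and resolved in a single batched `IN (...)` query. All names here are illustrative, and the "database" is a stand-in dict:

```python
# Toy dataloader: batches many load() calls into one backend query.
class CommentLoader:
    def __init__(self, batch_fn):
        self.batch_fn = batch_fn  # runs ONE query for many keys
        self.queue = []

    def load(self, article_id):
        self.queue.append(article_id)

    def dispatch(self):
        # One round trip, e.g. SELECT ... WHERE article_id IN (?, ?, ...)
        results = self.batch_fn(self.queue)
        self.queue = []
        return results

def fake_batch_query(article_ids):
    # Stands in for a single IN (...) query against the comments table.
    db = {1: ["nice"], 2: ["ok", "great"], 3: []}
    return {a: db.get(a, []) for a in article_ids}

loader = CommentLoader(fake_batch_query)
for aid in [1, 2, 3]:
    loader.load(aid)
print(loader.dispatch())  # one query instead of three
```

Real implementations (e.g. Facebook's dataloader pattern) also dispatch automatically at the end of an event-loop tick and cache per-request, but the batching is the part that defeats the N+1 problem.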
The real advantage I see for REST in that scenario is that it can _feel_ faster to the end user, since you'll get some data back earlier. Running a small query on thousands of requests is slower overall, but you can display the first little result to the user faster than a big gql payload.
It's quite simple (easier in my opinion than in REST) to build a targeted set of GraphQL endpoints that fit end-user needs while being secure and performant. Also, as the other user posted, "edges" and "nodes" have nothing to do with the core GraphQL spec itself.
Of course, the same could happen for standard REST as well, but I think the foot guns are more limited.
> And pagination is gross, with `edges` and `node`
This just reads like an allergic reaction to "the new" and toward change. Edges and nodes are elegant, less error prone than limits and skips, and most importantly, datasource independent.
In my experience, securing nested assets based on owner/editor/reader/anon was rather difficult and required inspecting the schema stack. I was using the Apollo stack.
This was in the context of apps in projects in accounts (common pattern for SaaS where one email can have permissions in multiple orgs or projects)
One fairly interesting denial of service vector that I've found on nearly every API I've scanned has to do with error messages. Many APIs don't bound the number of error messages that are returned, so you can query for a huge number of fields that aren't in the schema, and then each of those will translate to an error message in the response.
If the server supports fragments, you can also sometimes construct a recursive payload that expands, like the billion laughs attack, into a massive response that can take down the server, or eat up their egress costs.
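A hypothetical amplification payload of that shape could look like the following. To be clear, the spec requires servers to reject fragment cycles and to merge duplicate field selections, so this only bites servers that skip validation and expand fragments naively; all names here are made up:

```graphql
query Bomb {
  shop {
    ...a
    ...a
    ...a
  }
}
fragment a on Shop { name ...b ...b ...b }
fragment b on Shop { name ...c ...c ...c }
fragment c on Shop { name name name }
```

Each layer spreads the next one three times, so a few lines of query can expand into an exponentially larger selection set (or an exponentially larger error list) on an unvalidated server.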
I like edges and node, it gives you a place to encode information about the relationship between the two objects, if you want to. And if all your endpoints standardize on this Relay pagination, you get standard cursor/offset fetching, along with the option to add relationship metadata in the future if you want, without breaking your schema or clients.
edit: the page you linked to has similar rate limiting behavior for both REST and GraphQL lol
That said, like you I am a fan.
It’s a pretty defensible pattern, more here for those interested: https://andrewingram.net/posts/demystifying-graphql-connecti...
The overall verbosity of a GraphQL query tends not to be a huge issue either, because in practice individual components only concern themselves with small subsets of it (i.e. fragments). I'm a firm believer that people will have a better time with GraphQL if they adopt Relay's bottom-up, fragment-oriented pattern, rather than a top-down, query-oriented pattern, which you often see in codebases by people who've never heard of Relay.
{
  products(first: 3) {
    pageInfo {
      hasNextPage
      endCursor
    }
    edges {
      cursor
      node {
        handle
      }
    }
  }
}

For starters, REST and "JSON over HTTP/1.1" are not necessarily synonyms. This description conflates them, when really there are three distinct ways to use JSON over HTTP/1.1: actual REST (including HATEOAS), the "OpenAPI style" (still resource-oriented, but without HATEOAS), and JSON-RPC. For most users, the relative merits of these three approaches are going to be a much bigger deal than the question of whether or not to use JSON as the serialization format.
Similarly, for gRPC, you have a few questions: Do you want to do a resource-oriented API that can easily be reverse proxied into a JSON-over-HTTP1.1 API? If so then you gain the ability to access it from Web clients, but may have to limit your use of some of gRPC's most distinctive features. How much do you want to lean toward resource-orientation compared to RPC? gRPC has good support for mixing and matching the two, and making an intentional decision about how you do or do not want to mix them is again probably a much bigger deal in the long run than the simple fact of using protocol buffers over HTTP/2.
GraphQL gives clients a lot of flexibility, and that's great, but it also puts a lot of responsibility on the server. With GraphQL, clients get a lot of latitude to construct queries however they want, and the people constructing them won't have any knowledge about which kinds of querying patterns the server is prepared to handle efficiently. So there's a certain art to making sure you don't accidentally DOS attack yourself. Guarding against this with the other two API styles can be a bit more straightforward, because you can simply not create endpoints that translate into inefficient queries.
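One common guard against accidentally DOSing yourself is to bound query depth before executing anything. Real servers do this with validation rules over the parsed AST (graphql-core and Apollo both support depth-limiting rules); this brace-counting sketch just illustrates the idea and is not how production validators work:

```python
# Illustrative depth check: reject queries nested past a fixed limit.
MAX_DEPTH = 6  # arbitrary example limit

def query_depth(query: str) -> int:
    """Approximate nesting depth by tracking curly braces."""
    depth = max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth

q = "{ author { articles { comments { body } } } }"
print(query_depth(q))          # nesting depth of the query
print(query_depth(q) <= MAX_DEPTH)  # would this query be allowed?
```

Depth limits are usually paired with query cost analysis (weighting list fields by their `first`/`last` arguments), since a shallow query over huge lists can be just as expensive as a deep one.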
"Generates client and server code in your programming language. This can save engineering time from writing service calling code"
It saves around 30% development time on features with lots of API calls. And it grows better since there is a strict contract.
Human readability is over-rated for API's.
I’m not sure how gRPC handles this, but adding an additional field to a SOAP interface meant regenerating code across all the clients else they would fail at runtime while deserializing payloads.
A plus for GraphQL is that because each client request is a custom query, new fields added to the server have no impact on existing client code. Facebook famously said in one of their earlier talks on GraphQL that they didn’t version their API, and have never had a breaking change.
Really, I don’t think gRPC and GraphQL should even be compared, since they support radically different use cases.
This is basically the reason every field in proto2 will be marked optional and proto3 is “optional” by default. IIRC the spec will just ignore these fields if they aren’t set, or if they are present but the receiver doesn’t know how to use them (it won’t delete them if the message needs to be forwarded; of course this only works if you don’t reserialize it. Edit: this is not true, see below).
Also, never reuse an old/deleted field number, and be very careful if you change the type (better not to).
There's a simple, universal two-step process to render it more-or-less a non-issue. First, enable the introspection API on all servers. This is another spot where I find gRPC beats Swagger at its own game: any server can be made self-describing, for free, with one line of code. Second, use tools that understand gRPC's introspection API. For example, grpcurl is a command-line tool that automatically translates protobuf messages to/from JSON.
So we had to write our own code generator templates. It was a pain compared to GRPC.
But yes, in theory it can be done.
Buckle up, this is going to be a long comparison. Also, disclaimer, this was OpenAPI 2 I was looking at. I don't know what has changed in 3.
gRPC has its own dedicated specification language, and it's vastly superior to OpenAPI's JSON-based format. Being not JSON means you can include comments, which opens up the possibility of using .proto files as a one stop shop for completely documenting your protocols. And the gRPC code generators I played with even automatically incorporate these comments into the docstrings/javadoc/whatever of the generated client libraries, so people developing against gRPC APIs can even get the documentation in their editor's pop-up help.
Speaking of editor support, using its own language means that the IDEs I tried (vscode and intellij) offer much better editor assistance for .proto files than they do for OpenAPI specs. And it's just more concise and readable.
Finally, gRPC's proto files can import each other, which offers a great code reuse story. And it has a standard library for handling a lot of common patterns, which helps to eliminate a lot of uncertainty around things like, "How do we represent dates?"
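The import and "standard library" points can be shown in a short .proto sketch. The file, package, and message names below are made up for illustration; `google/protobuf/timestamp.proto` is the real well-known type that answers the "how do we represent dates?" question:

```protobuf
// product.proto - illustrative only; names are hypothetical.
syntax = "proto3";

package shop.v1;

// Well-known types ship with protoc and every official runtime.
import "google/protobuf/timestamp.proto";

// Comments like this one flow into the generated docstrings,
// so editor pop-up help shows them to API consumers.
message Product {
  string handle = 1;
  google.protobuf.Timestamp created_at = 2;  // standard timestamp type
}
```

Because imports resolve across files, shared messages like `Product` can live in one place and be reused by every service definition that needs them.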
Next is the code generation itself. gRPC spits out a library that you import, and it's a single library for both clients and servers. The server-side stuff is typically an abstract class that you extend with your own implementation. I personally like that, since it helps keep a cleaner separation between "my code" and "generated code", and also makes life easier if you want to have more than one service publishing some of the same APIs.
OpenAPI doesn't really have one way of doing it, because all of the code generators are community contributions (with varying levels of documentation), but the two most common ways to do it are to generate only client code, or to generate an entire server stub application whose implementation you fill in. Both are terrible, IMO. The first option means you need to manually ensure that the client and server remain 100% in sync, which eliminates one of the major potential benefits of using code generation in the first place. And the second makes API evolution more awkward, and renders the code generation all but useless for adding an API to an existing application.
gRPC's core team rules the code generation with an iron fist, which is both a pro and a con. On the upside, it means that things tend to behave very consistently, and all the official code generators meet a very high standard for maturity. On the downside, they tend to all be coded to the gRPC core team's standards, which are very enterprisey, and designed to try and be as consistent as possible across target languages. Meaning they tend to feel awkward and unidiomatic for every single target platform. For example, they place a high premium on minimizing breaking changes, which means that the Java edition, which has been around for a long time, continues to have a very Java 7 feel to it. That seems to rub most people (including me) the wrong way nowadays.
OpenAPI is much more, well, open. Some target platforms even have multiple code generators representing different people's vision for what the code should look like. Levels of completeness and documentation vary wildly. So, you're more likely to find a library that meets your own aesthetic standards, possibly at the cost of it being less-than-perfect from a technical perspective.
For my part, I came away with the impression that, at least if you're already using Envoy, anyway, gRPC + gRPC-web may be the least-fuss and most maintainable way to get a REST-y (no HATEOAS) API, too.
As much as I love a well-designed IDL (I'm a Cap'n Proto user, myself), the first thing I reach for is ReST. It most cases, it's sufficient, and in all cases it keeps builds simple and dependencies few.
Wow, yes! Yes!
IDLs represent substantial complexity, and complexity always needs to be justified. Plain, "optimistically-schema'd" ;) REST, or even just JSON-over-HTTP, should be your default choice.
For a newcomer having `message Thing {}` and `service ThingChanger {}` is very approachable because it maps directly into the beginner's native programming language. If they started out with Python/C++/Java you can say "It's like a class that lives on another computer" and they instantly get it.
Does grpc-web work well? Is there a way to skip the proxy layer and use protobufs directly if you use websockets?
[1]: https://github.com/grpc/grpc-web
I guess the big advantage is that when you write a manual query you can still pull down more data than you need by accident. Whereas this approach only pulls down what you need.
I never did use the feature, having got tired of using Thrift for other reasons (e.g. poor interoperability Java/Scala).
REST can also return protobufs, with content type application/x-protobuf. Heck, it can return any Content-Type. It doesn't have to be confined to JSON.
GRPC needs to support the language you're using. It does support a lot of the popular languages now. But most languages have some sort of http server or client to handle REST.
Often I’ll want much more control over caching and cache invalidation than what you can do with HTTP caching.
I’d be interested to see an analysis of major websites usage of HTTP caching on non-static (i.e. not images, JS, etc) resources. I bet it’s pretty minimal.
For example, you want to make sure large resource blobs (e.g. pictures) are cached.
In this case you can decide not to put them in the GraphQL response, but instead put a REST URI to them there, and then have an endpoint like `/blobs/<some-kind-of-uid>` or `/blobs/pictures/<id>` or similar.
The same endpoints can also be used for pushing new resources (GraphQL creates an "empty" `/blobs/<id>` entry to which you can then push).
While this entails an additional round-trip and is probably not the best approach for some cases, for others, like (not small) file up-/downloads, it is quite nice, as such an operation is often explicitly triggered by a user in a way where the additional round-trip time doesn't matter at all. An additional benefit is that it can make things like halting and resuming downloads easier.
Also, with gzip, the size of json is not a big deal given redundant fields compress well.
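That claim is easy to check empirically: JSON whose field names repeat on every row compresses dramatically, so the verbosity mostly disappears on the wire. The field names below are invented for the demo:

```python
# Rough demonstration: repeated JSON keys compress away under gzip.
import gzip
import json

rows = [
    {"identifier": i, "product_handle": f"item-{i}", "in_stock": True}
    for i in range(500)
]
raw = json.dumps(rows).encode()
packed = gzip.compress(raw)

print(f"raw: {len(raw)} bytes, gzipped: {len(packed)} bytes")
```

The exact ratio depends on the data, but highly repetitive payloads like this routinely shrink by well over 5x, which is why "JSON is too fat" arguments should always be made against the compressed size.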
Apollo support link: https://www.apollographql.com/docs/apollo-server/performance...
That being said some of those advanced use cases may be off by default in Apollo.
Strictly speaking, that's not what REST considers "easily discoverable data". That endpoint would need to have been discovered by navigating the resource tree, starting from the root resource.
Roy Fielding (author of the original REST dissertation): "A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). (...) Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC’s functional coupling].
A REST API should be entered with no prior knowledge beyond the initial URI (bookmark) and set of standardized media types that are appropriate for the intended audience (i.e., expected to be understood by any client that might use the API). "[1]
1. https://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypert...
Edit: Pretty much every REST API I see these days explains how to construct your URLs to do different things - rather than treating all URLs as opaque. Mind you having tried to create 'pure' HATEOAS REST API I think I prefer the contemporary approach!
- need an extra step when doing protoc compilation of your models
- cannot easily inspect and debug your messages across your infrastructure without a proper protobuf decoder/encoder
If you only have Go microservices talking via RPC, there is GOB encoding, which is a slimmed-down version of Protocol Buffers: it's self-describing, CPU-efficient, and natively supported by the Go standard library, and therefore probably a better option, although not as space-efficient. If you talk with other non-Go services, then a JSON or XML transport encoding will do the job too (JSON-RPC).
The GraphQL one is great as what is commonly known as a "backend for frontend", but inside the backend. It makes designing an easy-to-use (and supposedly more efficient) API easier for the FE, but much less so for the backend, where it warrants increased implementation complexity and maintenance.
Good old REST is admittedly not as flexible as RPC or GraphQL, but it does the job for simpler and smaller APIs, albeit, anecdotally, I see it being used less and less.
The protobuf stuff can start to pay off as early as when you have two or more languages in the project.
Having used gRPC in very small teams (<5 engineers touching backend stuff) I had a very different experience from yours.
> need an extra step when doing protoc compilation of your models
For us this was hidden by our build systems. In one company we used Gradle and then later Bazel. In both you can set it up so you plop a .proto into a folder and everything "works" with autocompletes and all.
> cannot easily inspect and debug your messages across your infrastructure without a proper protobuf decoder/encoder
There's a lot of tooling that has recently been developed that makes all of this much easier.
- https://github.com/fullstorydev/grpcurl
- https://github.com/uw-labs/bloomrpc
You can also use grpc-web as a reverse proxy to expose normal REST-like endpoints for debugging as well.
> If you talk with other non-Go services then a JSON or XML transport encoding will do the job too (JSON rpc).
The benefit of protos is that they're a source of truth across multiple languages/projects, with well-known ways to maintain backwards compatibility.
You can even build tooling to automate very complex things:
- Breaking Change Detector: https://docs.buf.build/breaking-usage/
- Linting (Style Checking): https://docs.buf.build/lint-usage/
There's many more things that can be done but you get the idea.
On top of this you get something else that is way better: a relatively fast server that's configured and interfaced with in the same way in every programming language. This has been a massive time sink in the past, where you have to investigate nginx/*cgi, sonic/flask/waitress/wsgi, rails, and hundreds of other things for every single language stack, each with its own gotchas. gRPC's ecosystem doesn't really have that pain point.
With REST you can just use a normal HTTP caching proxy for all the GETs under certain paths, off the shelf.
Using a hybrid (JSON-RPC for writes and authenticated reads, REST for global reads) would have saved me a lot of time spent building and maintaining a JSON-RPC caching layer.
There is benefit to a GET/POST split, and JSON-RPC forces even simple unauthenticated reads into a POST.
The other issue with JSON-RPC is, well, JSON. It's not the worst, but it's also not the best. JSON has no great canonicalization, so if you want to do signed requests or responses you're going to end up putting a string of JSON (inner) into a key's value at some point. Doing that in protobuf seems less gross to me.
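The usual workaround, absent a true canonical form, is to pin down one deterministic encoding (sorted keys, no whitespace) on both sides and sign those exact bytes. A minimal sketch, with a placeholder key:

```python
# Deterministic JSON encoding + HMAC signing sketch.
import hashlib
import hmac
import json

def canonical(obj) -> bytes:
    """One fixed encoding: sorted keys, no extra whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()

key = b"shared-secret"  # placeholder, not a real key-management scheme
body = {"method": "transfer", "amount": 10}

sig = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
print(sig)

# Key order no longer matters once both sides canonicalize:
print(canonical({"amount": 10, "method": "transfer"}) == canonical(body))
```

This handles the common cases but still isn't a full canonicalization (float formatting and Unicode escapes can differ across implementations), which is the commenter's point: binary formats sidestep the problem.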
That is, explicitly cache the information in your JavaScript frontend, or have your backend explicitly cache. That way it is easy to understand, and you can also control under what circumstances a cache is invalidated.
When you're many devs, many APIs, many resources, it really pays to have a consistent, well-defined way to do this. GraphQL is very close to what you've described, with some more defined standards. GRPC is close as well, except the serialisation format isn't JSON, it's something more optimised.
As a team grows these sorts of standards emerge from the first-pass versions anyway. These just happen to be pre-defined ones that work well with other things that you could choose to use if you wanted to.
For example:
POST /api/listPosts HTTP/1.1
{ userId: "banana", fromDate: 2342342342, toDate: 2343242 }
Response: HTTP/1.1 200 OK
[ { id: 32432, title: "Happy banana", userId: "banana" }, ... ]
Or in case of an error: HTTP/1.1 500 Internal Server Error
{ type: "class name of exception raised server side", message: "Out of bananas" }
The types can be specified with TypeScript if needed.

GET /api/module/method?param1&param2
or
POST /api/module/method Body: Json{ param1, param2 }
1. Hashes queries and uses GET requests.
2. If the hash is missing server-side, sends the standard GraphQL query as a POST.
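That two-step flow (Apollo calls this "automatic persisted queries") can be sketched as follows; the cache and request handlers here are toy stand-ins for the real client/server round trips:

```python
# Persisted-query sketch: GET by hash, POST the full query on a miss.
import hashlib

QUERY = "{ products(first: 3) { edges { node { handle } } } }"
query_hash = hashlib.sha256(QUERY.encode()).hexdigest()

server_cache = {}  # hash -> query text, persisted server-side

def get(hash_):
    """Step 1: client tries a cheap, CDN-cacheable GET by hash."""
    return server_cache.get(hash_)

def post(hash_, query):
    """Step 2: on a miss, client POSTs the full query text once."""
    server_cache[hash_] = query
    return query

if get(query_hash) is None:   # first request misses
    post(query_hash, QUERY)   # register the query
print(get(query_hash))        # every later request is a small GET
```

Because the hash-based requests are GETs with stable URLs, they play nicely with ordinary HTTP caches, which is the main win over always-POST GraphQL.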
The strength and real benefit of GraphQL comes in when you have to assemble a UI from multiple data sources and reconcile that into a negotiable schema between the server and the client.
Edit: claiming gql solves over/underfetch without mentioning that you're usually still responsible for implementing it (and it can be complex) in resolvers is borderline dishonest.
It solves gRPC's inability to work nicely with web browsers.
It's nice that you don't have to do any translation. Just gRPC in/out of the browser.
It directly addresses the cons mentioned in the article while retaining all the pros
I think you can probably formalize JSON API schemas in a useful way, but JSON API ain't it.
We have added so many layers and translations between our frontend and database.
Graphql brought us closer and it starts to run into some of the security concerns already.
What if someone just made this direct sql interface safe/restricted?
I've been in multiple shops where REST was the standard -- and while folks had interest in exploring GraphQL or gRPC, we could not justify pivoting away from REST to the larger team. Repeatedly faced with this `either-or`, I set out to build a generic app that would auto provision all 3 (specifically for data-access).
I posted in verbose detail about that project a few months ago, so here I'll just provide a summary: The project auto provisions REST, GraphQL & gRPC services that support CRUD operations to tables, views and materialized views of several popular databases (postgres, postgis, mysql, sqlite). The services support full CRUD (with validation), geoquery (bbox, radius, custom wkt polygon), complex_resources (aggregate & sub queries), middleware (access the query before db execution), permissions (table/view level CRUD configs), field redaction (enable query support -- without publication), schema migrations, auto generated openapi3/swagger docs, auto generated proto file.
I hope this helps anyone in a spot where this `versus` conversation pops up.
original promotional piece: https://news.ycombinator.com/item?id=25600934
docker implementation: https://github.com/sudowing/service-engine-template
youtube playlist: https://www.youtube.com/playlist?list=PLxiODQNSQfKOVmNZ1ZPXb...
It's massively useful to know exactly what 'type' a received payload is, as well as to get built-in, zero-boilerplate feedback if the payload you construct is invalid.
With gRPC you're absolutely correct. Then again, I wasn't trying to make an argument for or against any of these technologies per se - just pointing out that this is a major part of the value provided that wasn't really called out in the article.
If you're using typed languages on either the client or the server, having a serialization system that preserves those types is a very nice thing indeed.
because nobody has ever done this with protobuf...
btw what is the protobuf standard type for "date"?
[0] https://developers.google.com/protocol-buffers/docs/referenc...
You can, of course, do the thing that JS requires you always do and put an ISO8601 date in a string. This has the benefit of storing the offset data in the same field/var.
Javascript needs int64s as strings, I believe, because JS numbers max out at 53 bits of integer precision.
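This is easy to demonstrate: a JS number is an IEEE-754 double, exact only up to 2^53, and Python floats behave identically, so the precision loss can be shown directly:

```python
# Why int64s travel as strings in JSON aimed at JS clients:
# doubles cannot distinguish integers past 2**53.
big = 2**53 + 1                    # 9007199254740993

print(float(big) == float(2**53))  # the +1 is silently lost as a double
print(str(big))                    # the string form keeps every digit
```

This is exactly why protobuf's JSON mapping serializes `int64`/`uint64` fields as decimal strings rather than as JSON numbers.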
Completely solves the PUT verb mutation issue and even allows event-sourced like distributed architectures with readability (if your patch is idempotent)
I married it to mongoose [2] and added an extension called json-patch-rules to whitelist/blacklist operations [3] and my API life became a very happy place.
I've replaced hundreds of endpoints with json-patch APIs and some trivial middleware.
When you couple that stack with fast-json-patch [4] on the client you just do a simple deep compare between a modified object and a cloned one to construct a patch doc .
This is the easiest and most elegant stack I've ever worked with.
[2] https://www.npmjs.com/package/mongoose-patcher
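The client-side "deep compare a clone against the edited object" step can be sketched minimally. fast-json-patch does this properly with nested paths and array handling; this toy version handles only flat objects, and the field names are made up:

```python
# Toy RFC 6902-style diff between a snapshot and an edited object.
import copy

def diff(before, after):
    ops = []
    for k in before:
        if k not in after:
            ops.append({"op": "remove", "path": f"/{k}"})
        elif after[k] != before[k]:
            ops.append({"op": "replace", "path": f"/{k}", "value": after[k]})
    for k in after:
        if k not in before:
            ops.append({"op": "add", "path": f"/{k}", "value": after[k]})
    return ops

original = {"title": "Happy banana", "stock": 3}
edited = copy.deepcopy(original)   # clone before the user edits
edited["stock"] = 2
edited["tag"] = "fruit"

print(diff(original, edited))      # the patch doc to send to the API
```

The resulting list of ops is what gets POSTed/PATCHed to the server, where a json-patch library applies it against the stored document.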
I used to write Protocol Buffers stuff for this reason. But I realized after some time that compressed JSON is almost as good, if not better depending on the data, and a lot simpler and nicer to use. You can consider pre-sharing a dictionary if you always compress the same tiny messages. Of course JSON + compression is a bit more CPU-intensive than Protocol Buffers, but that has no impact on anything in most use cases.
Between SOAP and gRPC, gRPC is the better choice at this point. It’s simpler to implement and has decent multi-language support.
For now I will stick to SignalR
Sounds like write it yourself using two streams
https://web.archive.org/web/20210315144620/https://www.danha...