RPC Olympics – The Search for the Perfect RPC Protocol (opens in new tab)

(perimeterx.com)

88 pointsamirshk806y ago70 comments

70 comments

61 comments · 23 top-level

pkage6y ago· 7 in thread

Strong recommendation for ZeroMQ as well. It's a thin layer on TCP, is very fast, offers bindings in many languages, and is fairly flexible as well.

https://zeromq.org/

alfalfasprout6y ago

ZeroMQ and its bindings are well-proven, extremely well written, and the ZeroMQ guide is a work of art. That said, the bindings are not created equal. We've seen some unreliability with the Java package even though the official C/C++ and Python bindings work perfectly (the Go package too).

X-Istence6y ago

The thin layer on TCP leaves a lot of things to be desired. Such as knowing if the other end has disappeared and won't ever come back.

Last time I used ZeroMQ was ~7 months ago at my previous job, so maybe things have changed since then, but having to write hacks to see if the other side is still there or not is absolutely terrible.

Also, the Python bindings did not like fork() without exec, which yes is bad and all, but is still something that is done every so often in real world large programs.

excerionsforte6y ago

I second this recommendation. I use MsgPack over ZeroMQ with a custom RPC protocol. Protobuf, Capn Proto, Flatbuffers are good serialization mechanisms as well. Can't wait until the new Socket Types are out of draft stage.

I believe GRPC is pluggable so if one wanted to invest time in building a GRPC ZeroMQ Transport then that is a feasible route. GRPC bring really good RPC mechanisms to the table.

nurettin6y ago

especially when you use the router pattern on server side in order to have multiple sessions on the same socket.

nerdponx6y ago

Got any references on the router pattern? I searched but came up with a lot of IT related articles.

1 more reply

hhas016y ago

Surprised article didn’t include this in its comparison. Seems an obvious contender.

pjc506y ago

Marketing, I guess: message queueing and RPC are conceptually quite different, although it looks like zeromq has a request-response mode designed for this.

alfalfasprout6y ago· 6 in thread

Having just looked into this recently, I can also safely say that moving to REST HTTP/2 on its own already provides a significant speed benefit. Closing the gap to GRPC (or even substantially beating it) is possible by switching to a fast serialization format (eg; flatbuffers, msgpack, capnproto).

While GRPC has its place it also comes with headaches like pretty lousy generated interfaces, horrible debuggability, and unpredictable scaling.

gravypod6y ago

Came here to say something similar. The interfaces and tooling around gRPC in the public sphere is pretty bad. If you maintain a code base with multiple languages and want to compile gRPC for it you can choose between:

1. Having a complex build system that is aware of gRPC

2. Massive migraines

Unfortunately build systems integrations, IDE integrations, and generated code is unanimously awful for gRPC.

Every company I've seen use grpc has unfortunately adopted the practice of basically manually running protoc and committing the generated code into repo - sometimes modifying the code manually to make it import successfully (Python).

I hope Bazel evolves into a state where it's usable by the average engineer and has first class support for gRPC.

pjc506y ago

Not really - we have a product that builds C#/Java/Python/C++ versions of a grpc API, and we just do it by invoking 'protoc' at the relevant point?

treve6y ago

It's kinda weird to compare REST to RPC protocols though. It's a completely different paradigm. Maybe you're just building a HTTP API without the REST part? (JSON-RPC is a thing)

e12e6y ago

Actually traditional RPC is one of the paradigms considered in Fielding's thesis coining the term. I don't know why people seem to insist there are no other models than REST if you happen to have browser on one end. Sending Javascript down the asynchronous pipe designed for XML was at the very least a step back away from REST toward something more traditional (moving code, not data).

Section 3.5 "Moving code styles" (and 3.4 for RPC):

https://www.ics.uci.edu/~fielding/pubs/dissertation/net_arch...

hhas016y ago

Indeed; true REST interactions tend to be quite coarse-grained; essentially requests to update a remote state machine to match the described state. Most of what gets called “REST” isn’t; it’s just ad-hoc RPC sent over HTTP with arguments encoded as XML/JSON, which is probably what parent really means. Take casual claims of RESTfulness with the requisite bucket of salt. (If the term “REST API” is used, you can toss the bucket entirely as the very phrase is itself an oxymoron.)

pdpi6y ago

HTTP is a reasonable enough transport layer for RPC frameworks (e.g. gRPC is built on top of HTTP/2)

parhamn6y ago· 5 in thread

Cool! I don't know if a comparison of these really makes sense. You'd be conflating transport, serialization, and stream format (unary/stream).

For fullstack dev, I've been immensely happy with grpc/protobuf over http because of the type safety I get communicating between Golang and Typescript. This eliminates a whole class of bugs but is only a serialization benefit.

Generally: use the thing with the best tooling!

CameronNemo6y ago

I'd love to learn more about your stack, use cases, and development workflow. My employer is starting a golang, grpc/protobuf, and sveltejs web application from scratch and they are new technologies for almost everyone on my team!

I want to write my integration test suite for the back-end service in typescript to hopefully be able to reuse some of the test suite code for the front-end. But I'm struggling to set up a functional CI pipeline.

parhamn6y ago

Sure! We are actually currently building a platform on this stack (for eventual open sourcing) called Core [1]. Protobuf generally definitely needs some tooling and environment management work to make things smooth. But it is achievable and when you get there things work better a lot more often.

Drop me a line at parham@cloudsynth.com and happy to give guidance where I can.

[1] https://www.cloudsynth.com/products/core

thedance6y ago

There is no type safety in grpc or any other protobuf RPC scheme, full stop. The recipient of a message makes an assumption about the meaning of the message and decodes it accordingly. Any encoded protobuf might successfully decode as any other protobuf.

To repeat, grpc has no "type safety" whatsoever.

parhamn6y ago

Give me whatever terminology you prefer for: "I can ensure I maintain backwards compatibility (with a spec checker). Also with most of the generated client code you can ensure to a high degree of safety that the data I'm accessing exists and is of the correct type".

Happy to consider using that term instead. For most folks, the one I chose is good enough.

thedance6y ago

Quick reminder that rebuttals are more valuable than downvotes.

1 more reply

nurettin6y ago· 3 in thread

My current rpc model for python is to pickle function name and args/kwargs and publish to redis for any subscribers that might be listening.

If the channel is marked as persisting, I put the message to a hashset with current nanosecond as the timestamp and just send a ping instead.

It is pretty fast with asyncio redis client clocking at an avg 15k rpc/s on a single core with uvloop on a lowly i7

JoshTriplett6y ago

Be careful using the pickle format; it's not considered safe to unpickle untrusted data. https://docs.python.org/3/library/pickle.html

nurettin6y ago

I did see the big warning labels everywhere. However, there is simply no replacement that is equally fast (protocol 5), easy to use (copyreg) and imports necessary modules when deserializing. So tradeoffs were made.

1 more reply

ngneer6y ago

You beat me to it. Sounds like you would be opening yourself up to a variant of CSRF. One user could upload untrusted data that would be fed into an unsuspecting user. You should never feed or consume an untrusted pickle.

1 more reply

the84726y ago· 3 in thread

If you're on the same machine then pipes, unix sockets and shared memory are even faster options.

fulafel6y ago

Those are transport layers, a layer beneath a RPC protocol.

numbsafari6y ago

It wouldn’t really be an RPC, then would it?

the84726y ago

It's not a normal procedure call within the same process.

1 more reply

mappu6y ago· 2 in thread

I performed the exact opposite migration at $DAYJOB recently - swapping an internal text-based protocol to HTTP (over a local IPC pipe).

The main benefit was we could suddenly reuse all the codegenerated routers/docs/authentication from the HTTP ecosystem. It significantly simplified/standardised our IPC layer and reduced the "weirdness" in the codebase.

pathseeker6y ago

Yep. Unless you're doing thousands of requests per second+, the right RPC protocol is likely just HTTP due to the massive improvement in mature tooling, debugging, etc.

cogman106y ago

Today's toasters can handle 1k http requests per second. :)

The overhead of Http (especially 2/3) matters so little on modern hardware.

lulf6y ago· 2 in thread

Couple of protocols that would be nice to benchmark for comparison:

* AMQP 1.0 - can also be used for RPC without a broker in between client and server. See https://qpid.apache.org/proton/

* Aeron - low latency, UDP based, see https://github.com/real-logic/aeron

zvrba6y ago

Re proton: I tried to use it in point-to-point mode, but haven't been able to figure out how; Javadoc reference is useless for that. There exist only Python examples but Python APIs don't map 1:1 to Java APIs.

lulf6y ago

For Java, have a look at vertx-proton which builds on top of proton-j and is a bit more intuitive (still not great) than proton-j for creating servers and clients. :)

Example “blocking” client https://github.com/EnMasseProject/enmasse/blob/master/amqp-u... , but might give an idea of how to set “dynamic source” required for rpc.

In general though I think the Qpid python and c++ examples might be better.

skrowl6y ago· 2 in thread

Damn it, I just learned gRPC and now we're doing something different?

This is just as bad as front-end web dev!

ngneer6y ago

For perspective, RPC was invented in the 50s and 60s.

heavenlyblue6y ago

So it was invented twice?

1 more reply

teddyh6y ago· 2 in thread

No mention of authentication or encryption? I can’t believe that everyone is either using HTTPS with either client certificates or plain HTTP authentication, or using IPsec (or some form of IP tunneling) to make encryption transparent to the application.

pjc506y ago

Don't know why this was downvoted - it's one of those things that's a huge pain to retrofit.

tenebrisalietum6y ago

That's a Layer 5 or 6 issue, or lower if you want to talk about IPSec, not a Layer 7 issue.

anonu6y ago· 2 in thread

Curious to hear anyone's experience with RPC over Nameko in Python. We use it extensively on top of RabbitMQ and its been pretty robust.

prakhunov6y ago

RPC over Rabbit can work (and does) but you do have to remember that it in the end it is a queue and a couple of bad messages can stop your entire system.

RPC over Rabbit is fine if you don't care about the result, or you can guarantee that each message gets processed in a short constant time.

monus216y ago

I have a similar implementation of RPC over AMQP in Node.js and 2 things i'd advise. (Similar to my sibling poster)

1) Don't requeue on error. One bad message could bring down your entire service. Better to just push it to Sentry and make a fix for it.

2) Have a timeout in your message handler.

silasb6y ago· 1 in thread

I recently spent some time between raw UDP and GRPC and learned a ton. This was for a hardware project so it's different from what web devs typically work on.

The most important thing that I found out while doing my research is it's not the fastest bytes that win. Well, that does matter, but it's not that important. It's reducing performance variance [1]. While my project wasn't optimizing for speed, it was optimizing for zero-copy de/serialization, but this often ends up as a solution for high-speed transfers. SBE, Flattbuffers, Cap'n Proto all had their places, but I ended up not using any of those and just hand-rolling something similar to what SBE would do. If this was a $DAYJOB project I'd probably end up doing something with SBE.

[1]: https://speice.io/2019/07/high-performance-systems.html

a_t486y ago

At my last job, I had a need for high bandwidth low latency messaging over UDP - we started out with raw C structs over a home built reliable UDP library. We ended up laying protobuf on top a year after because it was honestly a headache not having a serialization library as message types got more complex. It didn't end up causing enough slowdown to impact the run speed of the application (all serialization was done on the app thread rather than the very performance sensitive network thread) as we had other bottlenecks, and the developer workflow (and additional message validation) made it worth it. SBE looks neat, though, I didn't know about that one.

H8crilA6y ago· 1 in thread

Apples to oranges.

TCP has no concept of a message, you need to build it on top of TCP.

Which HTTP does, but it doesn't have a standard concept of a message format, you need to build it top of HTTP.

Which gRPC does, but it closes the connection after a successful exchange.

Which is avoided by streaming gRPC - makes sense if you know you'll be talking more over this channel.

None of those protocols are really comparable, and most of the differences boil down to the serialisation protocol (binary/proto or textual, like JSON).

tweenagedream6y ago

Usually grpc connections are left open after a message is sent and received. And since it's based on h2, multiple message streams can be multiplexed over a single connection. That is part of the benefit of grpc, establish a long lived connection and send your messages as needed. Similar to using keep-alive with http.

EdSchouten6y ago· 1 in thread

I think it comes to no surprise that sending data through a plain TCP socket is faster than using gRPC. The only downside is that over time requirements will likely start to stack up:

- Suddenly you want to enable TLS between one or more components, meaning you need to wrap the socket inside a TLS channel.

- You discover that your client behaves poorly when servers go offline, so you add your own logic for keepalives/pings.

- At some point you want to add metrics to all of this, so you decide to manually add Prometheus metrics to the client/server.

- Later on you want to attach OAuth2 tokens to requests as well, so that you can do credential passing.

- In order to get more insight in your setup, you decide that you want to use this in combination with OpenTracing/Jaeger.

Once all of those features are added to your Redis-like protocol, you discover that you've basically reinvented gRPC... poorly.

fulafel6y ago

In the conclusions the author ends up recommending Redis protocol and the plain TCP conclusion is least attractive of all, so I think you and the article are agreement that plain TCP is not the way to go.

sparcotan6y ago· 1 in thread

Should include smf[0].

[0] https://github.com/smfrpc/smf

ysleepy6y ago

A niche cpp-only lib that is looking for a maintainer.

I would not consider using it for production code either.

wallyqs6y ago

For speed, NATS.io also has RPC semantics and protocol is fairly straightforward like Redis: https://docs.nats.io/developing-with-nats/sending/request_re... || https://docs.nats.io/nats-protocol/nats-protocol

dpipemazo6y ago

We did a similar analysis in the early days at Elementary after rolling something custom based on ZMQ. We would up creating Atom: https://github.com/elementary-robotics/atom

Atom is an easy, Redis Streams-based RPC that also emphasizes docker containerization of microservices. We support plug-ang-play serialization with msgpack and Apache Arrow currently supported and more on the roadmap. You can also send raw binary if you please.

Another nice thing about Redis is that if you're running microservices on the same instance you can connect to redis through a linux socket on tmpfs and bypass the TCP stack to get even better performance.

brozaman6y ago

This isn't a proper comparison because the writer compares a few things that aren't really the same...

Anyway the problem with comparing rpc protocols is that you need to do a per case benchmark if you really care about performance. Pretty much every decent solution will be better than another equally good solution depending on the use case.

Years ago, one of my customers told me he used to work doing high frequency trading in a bank, and the bank had several tailor made solutions for data serialization and RPC made for very specific cases, and they were just better than any generic solution.

flyinprogrammer6y ago

Folks might want to checkout RSocket: https://github.com/rsocket/rsocket-go

asimpletune6y ago

I wish capnproto was included in this.

otabdeveloper26y ago

No such thing as "RPC Protocol", there's a huge insurmountable gulf between "I want my Python and Javascript services to pass ad-hoc messages between themselves" and "I want to send C++ structs over the network with typesafety and minimal overhead".

pjc506y ago

The failure to use the same vertical scale for the graphs or provide an overlaid one is infuriating. So all the bar charts look identical and you have to read the scale to pick up subtle factor-of-ten differences?

MichaelMoser1236y ago

i think rpc solutions should be able to gain in performance by having host byte order/little endian as the wire format. I mean spending all those cycles on bit swapping is pointless if both endpoints are in host byte order.

seemslegit6y ago

Wait what ? Why is emulating a redis server (and to whom ?) is even a thing ?

j / k navigate · click thread line to collapse

70 comments

61 comments · 23 top-level

pkage6y ago· 7 in thread

Strong recommendation for ZeroMQ as well. It's a thin layer on TCP, is very fast, offers bindings in many languages, and is fairly flexible as well.

https://zeromq.org/

alfalfasprout6y ago

X-Istence6y ago

The thin layer on TCP leaves a lot of things to be desired. Such as knowing if the other end has disappeared and won't ever come back.

Last time I used ZeroMQ was ~7 months ago at my previous job, so maybe things have changed since then, but having to write hacks to see if the other side is still there or not is absolutely terrible.

Also, the Python bindings did not like fork() without exec, which yes is bad and all, but is still something that is done every so often in real world large programs.

excerionsforte6y ago

I believe GRPC is pluggable so if one wanted to invest time in building a GRPC ZeroMQ Transport then that is a feasible route. GRPC bring really good RPC mechanisms to the table.

nurettin6y ago

especially when you use the router pattern on server side in order to have multiple sessions on the same socket.

nerdponx6y ago

Got any references on the router pattern? I searched but came up with a lot of IT related articles.

1 more reply

hhas016y ago

Surprised article didn’t include this in its comparison. Seems an obvious contender.

pjc506y ago

Marketing, I guess: message queueing and RPC are conceptually quite different, although it looks like zeromq has a request-response mode designed for this.

alfalfasprout6y ago· 6 in thread

While GRPC has its place it also comes with headaches like pretty lousy generated interfaces, horrible debuggability, and unpredictable scaling.

gravypod6y ago

1. Having a complex build system that is aware of gRPC

2. Massive migraines

Unfortunately build systems integrations, IDE integrations, and generated code is unanimously awful for gRPC.

I hope Bazel evolves into a state where it's usable by the average engineer and has first class support for gRPC.

pjc506y ago

Not really - we have a product that builds C#/Java/Python/C++ versions of a grpc API, and we just do it by invoking 'protoc' at the relevant point?

treve6y ago

It's kinda weird to compare REST to RPC protocols though. It's a completely different paradigm. Maybe you're just building a HTTP API without the REST part? (JSON-RPC is a thing)

e12e6y ago

Section 3.5 "Moving code styles" (and 3.4 for RPC):

https://www.ics.uci.edu/~fielding/pubs/dissertation/net_arch...

hhas016y ago

pdpi6y ago

HTTP is a reasonable enough transport layer for RPC frameworks (e.g. gRPC is built on top of HTTP/2)

parhamn6y ago· 5 in thread

Cool! I don't know if a comparison of these really makes sense. You'd be conflating transport, serialization, and stream format (unary/stream).

Generally: use the thing with the best tooling!

CameronNemo6y ago

parhamn6y ago

Drop me a line at parham@cloudsynth.com and happy to give guidance where I can.

[1] https://www.cloudsynth.com/products/core

thedance6y ago

To repeat, grpc has no "type safety" whatsoever.

parhamn6y ago

Happy to consider using that term instead. For most folks, the one I chose is good enough.

thedance6y ago

Quick reminder that rebuttals are more valuable than downvotes.

1 more reply

nurettin6y ago· 3 in thread

My current rpc model for python is to pickle function name and args/kwargs and publish to redis for any subscribers that might be listening.

If the channel is marked as persisting, I put the message to a hashset with current nanosecond as the timestamp and just send a ping instead.

It is pretty fast with asyncio redis client clocking at an avg 15k rpc/s on a single core with uvloop on a lowly i7

JoshTriplett6y ago

Be careful using the pickle format; it's not considered safe to unpickle untrusted data. https://docs.python.org/3/library/pickle.html

nurettin6y ago

1 more reply

ngneer6y ago

1 more reply

the84726y ago· 3 in thread

If you're on the same machine then pipes, unix sockets and shared memory are even faster options.

fulafel6y ago

Those are transport layers, a layer beneath a RPC protocol.

numbsafari6y ago

It wouldn’t really be an RPC, then would it?

the84726y ago

It's not a normal procedure call within the same process.

1 more reply

mappu6y ago· 2 in thread

I performed the exact opposite migration at $DAYJOB recently - swapping an internal text-based protocol to HTTP (over a local IPC pipe).

pathseeker6y ago

Yep. Unless you're doing thousands of requests per second+, the right RPC protocol is likely just HTTP due to the massive improvement in mature tooling, debugging, etc.

cogman106y ago

Today's toasters can handle 1k http requests per second. :)

The overhead of Http (especially 2/3) matters so little on modern hardware.

lulf6y ago· 2 in thread

Couple of protocols that would be nice to benchmark for comparison:

* AMQP 1.0 - can also be used for RPC without a broker in between client and server. See https://qpid.apache.org/proton/

* Aeron - low latency, UDP based, see https://github.com/real-logic/aeron

zvrba6y ago

lulf6y ago

For Java, have a look at vertx-proton which builds on top of proton-j and is a bit more intuitive (still not great) than proton-j for creating servers and clients. :)

Example “blocking” client https://github.com/EnMasseProject/enmasse/blob/master/amqp-u... , but might give an idea of how to set “dynamic source” required for rpc.

In general though I think the Qpid python and c++ examples might be better.

skrowl6y ago· 2 in thread

Damn it, I just learned gRPC and now we're doing something different?

This is just as bad as front-end web dev!

ngneer6y ago

For perspective, RPC was invented in the 50s and 60s.

heavenlyblue6y ago

So it was invented twice?

1 more reply

teddyh6y ago· 2 in thread

pjc506y ago

Don't know why this was downvoted - it's one of those things that's a huge pain to retrofit.

tenebrisalietum6y ago

That's a Layer 5 or 6 issue, or lower if you want to talk about IPSec, not a Layer 7 issue.

anonu6y ago· 2 in thread

Curious to hear anyone's experience with RPC over Nameko in Python. We use it extensively on top of RabbitMQ and its been pretty robust.

prakhunov6y ago

RPC over Rabbit can work (and does) but you do have to remember that it in the end it is a queue and a couple of bad messages can stop your entire system.

RPC over Rabbit is fine if you don't care about the result, or you can guarantee that each message gets processed in a short constant time.

monus216y ago

I have a similar implementation of RPC over AMQP in Node.js and 2 things i'd advise. (Similar to my sibling poster)

1) Don't requeue on error. One bad message could bring down your entire service. Better to just push it to Sentry and make a fix for it.

2) Have a timeout in your message handler.

silasb6y ago· 1 in thread

I recently spent some time between raw UDP and GRPC and learned a ton. This was for a hardware project so it's different from what web devs typically work on.

[1]: https://speice.io/2019/07/high-performance-systems.html

a_t486y ago

H8crilA6y ago· 1 in thread

Apples to oranges.

TCP has no concept of a message, you need to build it on top of TCP.

Which HTTP does, but it doesn't have a standard concept of a message format, you need to build it top of HTTP.

Which gRPC does, but it closes the connection after a successful exchange.

Which is avoided by streaming gRPC - makes sense if you know you'll be talking more over this channel.

None of those protocols are really comparable, and most of the differences boil down to the serialisation protocol (binary/proto or textual, like JSON).

tweenagedream6y ago

EdSchouten6y ago· 1 in thread

I think it comes to no surprise that sending data through a plain TCP socket is faster than using gRPC. The only downside is that over time requirements will likely start to stack up:

- Suddenly you want to enable TLS between one or more components, meaning you need to wrap the socket inside a TLS channel.

- You discover that your client behaves poorly when servers go offline, so you add your own logic for keepalives/pings.

- At some point you want to add metrics to all of this, so you decide to manually add Prometheus metrics to the client/server.

- Later on you want to attach OAuth2 tokens to requests as well, so that you can do credential passing.

- In order to get more insight in your setup, you decide that you want to use this in combination with OpenTracing/Jaeger.

Once all of those features are added to your Redis-like protocol, you discover that you've basically reinvented gRPC... poorly.

fulafel6y ago

sparcotan6y ago· 1 in thread

Should include smf[0].

[0] https://github.com/smfrpc/smf

ysleepy6y ago

A niche cpp-only lib that is looking for a maintainer.

I would not consider using it for production code either.

wallyqs6y ago

dpipemazo6y ago

We did a similar analysis in the early days at Elementary after rolling something custom based on ZMQ. We would up creating Atom: https://github.com/elementary-robotics/atom

brozaman6y ago

This isn't a proper comparison because the writer compares a few things that aren't really the same...

flyinprogrammer6y ago

Folks might want to checkout RSocket: https://github.com/rsocket/rsocket-go

asimpletune6y ago

I wish capnproto was included in this.

otabdeveloper26y ago

pjc506y ago

MichaelMoser1236y ago

seemslegit6y ago

Wait what ? Why is emulating a redis server (and to whom ?) is even a thing ?

j / k navigate · click thread line to collapse