Suggestion for a project: make a tool that, given a proto description and a file of concatenated proto messages stored as binary strings (sort of like RecordIO at Google), lets you run simple SQL queries on the data, extract a subset of the fields from messages matching a predicate, and maybe even do simple aggregations. That was pretty handy. I really wish Google would open source some or most of this stuff. It’s not like keeping it closed source creates any kind of insurmountable competitive advantage, especially compared to the advantages that would accrue from broader adoption of protobufs.
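For what it's worth, here is a minimal sketch of what the core of such a tool might look like. It assumes a framing that is not RecordIO: each record is a base-128 varint length followed by the payload. The names `write_record`, `read_records` and `select` are made up for illustration, and `decode` is a caller-supplied function that turns one payload into a dict. With real protos that would probably be a generated class's `FromString` plus `google.protobuf.json_format.MessageToDict`; the demo uses JSON payloads as a stand-in so it runs without generated code.

```python
import io
import json
from typing import Any, BinaryIO, Callable, Dict, Iterator, List


def write_record(out: BinaryIO, payload: bytes) -> None:
    """Append one record: varint length prefix, then the payload bytes."""
    n = len(payload)
    while n >= 0x80:
        out.write(bytes([(n & 0x7F) | 0x80]))  # low 7 bits, continuation bit set
        n >>= 7
    out.write(bytes([n]))
    out.write(payload)


def read_records(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each payload from a stream of varint-length-prefixed records."""
    while True:
        length, shift = 0, 0
        while True:
            b = stream.read(1)
            if not b:
                if shift == 0:
                    return  # clean end of file between records
                raise EOFError("truncated length prefix")
            length |= (b[0] & 0x7F) << shift
            if not b[0] & 0x80:
                break
            shift += 7
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("truncated record")
        yield payload


def select(
    stream: BinaryIO,
    decode: Callable[[bytes], Dict[str, Any]],
    columns: List[str],
    where: Callable[[Dict[str, Any]], bool] = lambda row: True,
) -> Iterator[Dict[str, Any]]:
    """Roughly: SELECT columns FROM stream WHERE where(row)."""
    for payload in read_records(stream):
        row = decode(payload)
        if where(row):
            yield {name: row[name] for name in columns}


# Demo with JSON payloads standing in for serialized proto messages.
buf = io.BytesIO()
for row in ({"user": "a", "bytes": 10}, {"user": "b", "bytes": 300}, {"user": "a", "bytes": 5}):
    write_record(buf, json.dumps(row).encode())
buf.seek(0)
small = list(select(buf, json.loads, ["user", "bytes"], where=lambda r: r["bytes"] < 100))
total = sum(r["bytes"] for r in small)  # a "simple aggregation" over the matches
```

An actual SQL front end would parse the query into the `columns` and `where` arguments; that part is left out here.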
- a tee load balancer for gRPC that forwards the same requests to both A and B backend pools but only returns results from A. I don't think Envoy has this, but it should.
- load balancing dashboards showing traffic between frontends and backends
- load balancer support for dynamic sharding
- gnubbyd under ChromeOS: https://groups.google.com/a/chromium.org/forum/m/#!msg/chrom... (I think most of this is doable these days, but the initial setup requires a Linux system)
- Kubernetes: server-specific custom hyperlinks on dashboards (e.g. links to POD_IP:PORT/stats, /debug, etc. for each individual pod you are looking at)
- Kubernetes: multiple Docker images in the same container or pod. E.g. the first image could be your code, while the second one might be data or the JVM runtime, etc., without having to bundle them together or do costly copies in init containers.
- Kubernetes: canaries and automatic rollbacks
Envoy can do this, via its shadowing feature. See the docs here: https://www.envoyproxy.io/docs/envoy/v1.6.0/api-v2/api/v2/ro....
Hot off the presses: https://cloudplatform.googleblog.com/2018/04/introducing-Kay.... Though you have to use Spinnaker.
I would call that a "(live) traffic replayer" rather than a load balancer. To me, "load balancing" implies that the upstream traffic is divvied up among the downstream sinks, not that it gets copied to multiple downstreams.
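Whatever you call it, the behavior being asked for (send every request to both A and B, return only A's answer) can be sketched in a few lines. This is an illustration only, not how Envoy implements shadowing: the `Tee` class and its parameters are hypothetical, and plain Python callables stand in for the gRPC backend pools.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


class Tee:
    """Forward each request to a primary (A) and a shadow (B); return only A's result."""

    def __init__(
        self,
        primary: Callable[[Any], Any],
        shadow: Callable[[Any], Any],
        shadow_workers: int = 4,
    ) -> None:
        self._primary = primary
        self._shadow = shadow
        self._pool = ThreadPoolExecutor(max_workers=shadow_workers)

    def __call__(self, request: Any) -> Any:
        # Fire-and-forget copy to B; the caller never waits on it.
        self._pool.submit(self._call_shadow, request)
        # Only A's response (or exception) reaches the caller.
        return self._primary(request)

    def _call_shadow(self, request: Any) -> None:
        try:
            self._shadow(request)
        except Exception:
            pass  # a broken shadow pool must not affect live traffic

    def close(self) -> None:
        """Wait for in-flight shadow calls to finish, then release the workers."""
        self._pool.shutdown(wait=True)
```

A real version would also want a cap on queued shadow requests and some metrics comparing A and B, since the shadow responses are otherwise thrown away.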
Looks like some parts of it have escaped… https://github.com/eclesh/recordio
https://github.com/google/leveldb/blob/master/doc/log_format...
https://github.com/google/leveldb/blob/master/db/log_reader....
https://github.com/google/leveldb/blob/master/db/log_writer....
I think the decision not to open-source RecordIO is likely related to legacy baggage that's baked into the format. The LevelDB format above avoids that.
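To give a feel for how simple that LevelDB log format is, here is a rough reader sketch based on the log_format doc linked above: the file is a sequence of 32KB blocks, each physical record has a 7-byte header (4-byte checksum, 2-byte little-endian length, 1-byte type), and a logical record is either one FULL fragment or a FIRST/MIDDLE/LAST chain. The function name `read_log` is made up, and this sketch deliberately skips checksum verification (the real format uses a masked CRC32C, which is not in the Python standard library) and does no corruption recovery.

```python
import io
import struct
from typing import BinaryIO, Iterator, List, Optional

BLOCK_SIZE = 32768
HEADER_SIZE = 7  # checksum (4 bytes) + length (2 bytes) + type (1 byte)
FULL, FIRST, MIDDLE, LAST = 1, 2, 3, 4


def read_log(stream: BinaryIO) -> Iterator[bytes]:
    """Yield logical records from a LevelDB-style log. Checksums are NOT verified."""
    pending: Optional[List[bytes]] = None  # fragments of a record split across blocks
    while True:
        block = stream.read(BLOCK_SIZE)
        if not block:
            return
        pos = 0
        # Fewer than 7 bytes left in a block is a zero-filled trailer: skip it.
        while len(block) - pos >= HEADER_SIZE:
            _checksum, length, rtype = struct.unpack_from("<IHB", block, pos)
            pos += HEADER_SIZE
            data = block[pos:pos + length]
            if len(data) != length:
                raise ValueError("truncated record")
            pos += length
            if rtype == FULL and pending is None:
                yield data
            elif rtype == FIRST and pending is None:
                pending = [data]
            elif rtype == MIDDLE and pending is not None:
                pending.append(data)
            elif rtype == LAST and pending is not None:
                pending.append(data)
                yield b"".join(pending)
                pending = None
            else:
                raise ValueError(f"unexpected record type {rtype}")
```

Each yielded payload could then be handed to a proto parser, which is more or less what a RecordIO-like container needs to provide.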
It doesn't appear that the headers for this are public though.
Why not use SQLite[1] for storing this data? Storing structured data in a binary format and running SQL queries on it is already possible with SQLite, right?
The main thing stopping this endeavour is probably that, to the best of my knowledge, there isn't any standardization in the Protobuf community around file formats that serialize multiple messages together, like RecordIO. That, and my C skills are pretty rusty by now :)
Also, the code is basically about not being a jerk to other people. Seems like a low bar to meet.