So the queue keeps track of the high-watermark on a per-consumer basis, and all the consumer has to do is show up, tell the queue its deterministic name/id (which might be driven by imaging, configuration, or SDN), and the queue will serve up the next item that consumer hasn't seen yet.
This would be handy for really dynamic transient worker topologies because it keeps the mutable state and state tracking concerns entirely outside the transient worker.
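To make that concrete, here's a rough in-memory sketch of the idea. The names (`CursorQueue`, `publish`, `next`) are made up for illustration, nothing is persisted, and it advances the cursor as soon as an item is handed out, so treat it as a toy under those assumptions rather than a real design.

```python
class CursorQueue:
    """Append-only queue that remembers, per consumer, the next unread offset.

    Hypothetical sketch: in-memory only, and the cursor advances on fetch
    (no acknowledgement step), which a real durable queue may not want.
    """

    def __init__(self):
        self._items = []    # the log; an item's offset is its list index
        self._cursors = {}  # consumer_id -> offset of the next unseen item

    def publish(self, item):
        """Append an item and return its offset."""
        self._items.append(item)
        return len(self._items) - 1

    def next(self, consumer_id):
        """Return the next item this consumer hasn't seen, or None if caught up."""
        offset = self._cursors.get(consumer_id, 0)
        if offset >= len(self._items):
            return None
        self._cursors[consumer_id] = offset + 1
        return self._items[offset]


queue = CursorQueue()
queue.publish("job-1")
queue.publish("job-2")

# The worker only supplies its stable id; the queue holds the position.
first = queue.next("worker-a")
second = queue.next("worker-a")
other = queue.next("worker-b")  # a different consumer starts from the beginning
```

The worker itself carries no state here: if it dies and a replacement comes up with the same id, it just picks up where the old one left off.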
That said, I still wouldn't use LevelDB, unless I was expecting to do multi-attribute range queries or something (and now we're well outside queue territory). Even then you're still folding over the data for knowable start/end markers, and a linear scan over a binary term file will be faster than the multiple seeks + segment scans that LevelDB requires.
So it's either unreliable or slow.
Also, if you have dynamic transient worker topologies, you have to remember those positions. You are saving data for a later use that may never arrive. How long do you keep this data?
Seems like a pretty messy way of doing things.
Completely agree about LevelDB.
But moving the concern to the consumer to track the cursor doesn't make the protocol any more stable. To keep a stable cursor, the consumer would need to persist that someplace, which just pushes the acknowledgement to that persistence component instead. If a stable cursor is what you're after, then co-locating it with the durable queue provides a simpler solution with a slightly better consistency guarantee.
The garbage collection problem is a real one, but realistically how many consumers is an infrastructure service like this going to have? Tens? Hundreds? Thousands? Millions? Billions?
No matter which one of those you pick, it's a trivially small secondary index to maintain, even if you never reaped it. I mean, it's a K/V problem (consumer_id -> queue_offset) and there's a K/V store already sitting there. If you didn't want it to grow forever, you could establish a TTL policy via configuration.
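Something like the following is all I mean by a TTL policy. It's a sketch with invented names (`CursorIndex`, `ttl_seconds`, `reap`), a plain dict stands in for whatever K/V store is already there, and the clock is injectable only so the expiry is easy to exercise.

```python
import time


class CursorIndex:
    """consumer_id -> queue_offset map whose idle entries can be reaped.

    Hypothetical sketch: a dict stands in for the real K/V store, and
    ttl_seconds would come from configuration.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # consumer_id -> (queue_offset, last_seen)

    def get(self, consumer_id):
        """Return the stored offset, or 0 for an unknown (or reaped) consumer."""
        entry = self._entries.get(consumer_id)
        return entry[0] if entry else 0

    def set(self, consumer_id, queue_offset):
        """Record the offset and refresh the consumer's last-seen time."""
        self._entries[consumer_id] = (queue_offset, self._clock())

    def reap(self):
        """Drop cursors not updated within the TTL; return how many were dropped."""
        cutoff = self._clock() - self._ttl
        stale = [cid for cid, (_, seen) in self._entries.items() if seen < cutoff]
        for cid in stale:
            del self._entries[cid]
        return len(stale)
```

Note that in this sketch a reaped consumer that comes back later starts again from offset 0. Whether that's the right fallback (versus, say, starting at the tail) is a policy decision the sketch doesn't try to answer.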
The problem you would have is consumers that don't have stable or bounded ids, like a system that assigns a new id every time the consumer makes a request or is restarted.