This doesn't happen at a high rate, but it happens more than zero times per week for us. We pay for Sidekiq Pro and have superfetch enabled so we are protected. If we didn't do so we'd need to create some additional infra to detect jobs that were never properly run and re-run them.
[1] https://gitlab.com/gitlab-org/ruby/gems/sidekiq-reliable-fet...
[2] https://redis.io/commands/rpoplpush/#pattern-reliable-queue
I'm still confused about what you're saying though. You're saying that the language of "enhanced reliability" doesn't reflect losing 2 jobs over about 50*7 million (from your other comment)?
And that if you didn't pay for the service, you'd have to add some checks to make up for this?
That all seems incredibly reasonable to me.
It’s hard to get this right though. No matter where the line gets drawn, free users will complain that they don’t get everything for free.
I guess you can say that hardware issues on your host aren't under your control, but it's under your control to find a host that doesn't have these issues. And not even a full-on ACID database is going to be 100% reliable if you yank the power cord at the wrong moment.