The exact reason it blows up isn't even necessarily all that important, other than in its effect on what you should be doing to reduce the probability of downtime. Well-engineered systems are routinely developed from less than completely reliable parts. Stuff fails, we design for it.
It's certainly not a reason not to use it, if it results in a net gain in your ability to get things done and to maintain control and transparency over your deployed systems.
But it's certainly a good reason (among a long list of good reasons) to make sure you have a solid backup routine in place, including regularly testing both the backups' integrity and your ability to quickly restore a working prod system from them.
Distributed scalable automation will accidentally your data slightly more often. The more stuff you have, the more edge cases and bugs you have.
Scale big, fail big, as I like to say.
docker volume prune says:
"Remove all unused local volumes. Unused local volumes are those which are not referenced by any containers"
If it removed a local volume that was being used by a container, that is kinda bad.
2. Why are you running docker on ad-hoc machines you need to prune?
3. Why do you even need root access on production machines to fiddle around with docker commands?
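On the prune semantics above: you can at least preview what `docker volume prune` would target before deleting anything. A minimal sketch, assuming Docker is installed and the daemon is running (the `disposable` label is a convention assumed here for illustration, not a Docker built-in; the script is a no-op if Docker is absent):

```shell
# Preview the volumes `docker volume prune` would remove, without
# deleting anything we haven't explicitly opted in to losing.
if command -v docker >/dev/null 2>&1; then
  # "Dangling" = not referenced by any container: the prune candidates
  docker volume ls --filter dangling=true
  # Safer habit: prune only volumes explicitly labeled as disposable
  # ("disposable" is an assumed label convention, not built into Docker)
  docker volume prune --force --filter label=disposable
else
  echo "docker not found; skipping"
fi
```

Filtering prune by label means a volume must be both unreferenced *and* explicitly marked throwaway before it can be deleted, which narrows the blast radius of the bug discussed here.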
While this is obviously a bad bug (and there are many with Docker), it seems more of an operational procedures failure than anything else. You could be saying:
“Beware of rm -rf /, it just deleted 20gb of production data”
Ok. Sure. But why are your tools and procedures putting you in a position to make that mistake?
We don't know his environment. We don't know his company's policies. We don't know his hardware, connectivity, or budget constraints. These kinds of passive-aggressive responses are almost never helpful.
That’s not a surprise, and maybe the issue at the core here is not really Docker. That’s all.
I've seen plenty of stuff in my career where I've gone on record to say "hey - we really shouldn't do this". Nothing got done about it. But hey, I did what I could.
Recently I learned about Rasmussen's dynamic safety model. I think this is a very handy mental model to have. It's the human factors that make what we do really hard. Often line level practitioners know better than they are allowed to do in practice and trying to fight organizational politics to Do The Right Thing can be an uphill battle.
They may have valid reasons to do that, even if not common.
export WORKDIR=Home/me/proj
...
rm -rf /$WORKDIR
If something unsets $WORKDIR, or never sets it at all, wave bye-bye to your everything. And before you say "who would do that?!" -- I believe I heard this happened to a build of Red Hat that also had some kind of force-push and auto-pull-and-build on their version control, so every connected person had their copy of the software nuked. If not for the non-connected individuals, the entire codebase would apparently have been gone. Or so the legend goes.

For batch processing, my usual pattern has always been to move the data from (slower) network storage to local storage, process it, then move the results back to network storage.
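A cheap guard against exactly this footgun is the POSIX `${VAR:?}` expansion, which aborts loudly instead of silently expanding an unset variable to the empty string. A minimal sketch (the empty `WORKDIR` simulates the variable never being set):

```shell
#!/bin/sh
WORKDIR=""   # simulate the variable being unset/empty

# ${WORKDIR:?msg} makes the expansion itself fail, so the rm below can
# never collapse into `rm -rf /`. It runs in a subshell so only the
# subshell aborts, letting us report what happened afterwards.
if ( rm -rf "/${WORKDIR:?WORKDIR is unset -- refusing to rm}" ) 2>/dev/null
then
  echo "rm ran"
else
  echo "guard tripped, nothing deleted"
fi
```

With `WORKDIR` empty, the subshell exits before `rm` is ever invoked and the script prints "guard tripped, nothing deleted"; `set -u` at the top of a script gives similar protection for every variable at once.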
So then Docker is designed to treat all of those as disposable.
I just searched "recette" and only came up with French cooking references.
I don't really know if it's designed like that, but I treat them as disposable and unreliable, so I need a way to resuscitate the thing when something bad happens.
Perhaps a phonetic spelling of "resets" by a French person :)
It’s surprisingly easy with docker, especially when dealing with .... legacy systems.
The joys of open-source users...
Flip side . . .
Computers are terrible. They can screw things up so badly that it would take a thousand people the same amount of time to do equivalent damage.