I'm the VP of engineering at a mobility company operating in Europe and the US. Email: mickey at hey
If you pass through Paris and want to have a drink, shoot me an email.
Over time, the number of 3rd-party dependencies in our infrastructure grew and the traffic to our website increased 10X. As a result, the number of exceptions has grown to the point where it is difficult to separate the signal from the noise.
My question is: how do other engineering teams manage their exceptions? I don't want to ignore exceptions that could be real issues we should be fixing. I also don't want to litter our codebase with handlers that do nothing other than suppress harmless, non-user-generated exceptions, as that would only add complexity.
We recently tried a new approach based on a spreadsheet plus a few lines of Google Apps Script. Basically, we elect a “Bug Master” every week whose responsibility is to sort the bugs. This worked OK for a while, but engineers don’t all have the same level of involvement and aren’t always available: they might be in a rush to release something, busy with meetings, etc. In the meantime, exceptions stack up, potentially including important ones.
Any insights would be greatly appreciated, as we're trying to find a more sustainable and scalable way for our team to handle exceptions.