In the case of autodiff, you do presumably know the exact computations that are done; there just might be so many of them that it’s infeasible to work the error out analytically.
It depends on your requirements, so I’m not sure if this suggestion will work for you, but one strategy to consider would be to build the error bound computation into your math operations as a function. It’s much easier to compute error bounds numerically than it is to write a closed-form expression for them or to prove them. That strategy won’t give you conservative bounds over all possible inputs, only bounds for the inputs you actually ran, and if your input is non-deterministic, the answer will vary on every run. But you could sample your error bounds enough times to have some confidence in the statistical answer.
I’m assuming in both paragraphs above that you have control over the autodiff implementation and can modify it. If that’s not true (it’s not yours and it’s not open source), then the only alternative is to ask the maintainer.
IIRC this is what the PBR book (Physically Based Rendering) does: it weaves an error bound function into the base class of a math operation, and then you can query the error from a parse tree of different math ops, or something like that.
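To make the "build the error bound into your math operations" idea a bit more concrete, here is a rough Python sketch. It is not the PBR book's code and not any autodiff library's API; the `Tracked` class and its fields are names I made up for illustration. It assumes IEEE 754 doubles with round-to-nearest and needs Python 3.9+ for `math.ulp`. Each operation carries the incoming error bounds forward and adds half an ulp of the result for its own rounding.

```python
import math
from fractions import Fraction


class Tracked:
    """A float plus a running bound on its absolute error (hypothetical helper)."""

    def __init__(self, value, err=0.0):
        self.value = float(value)
        self.err = float(err)

    @staticmethod
    def _lift(x):
        # Plain numbers are treated as exact inputs (an assumption of this sketch).
        return x if isinstance(x, Tracked) else Tracked(x)

    def __add__(self, other):
        other = Tracked._lift(other)
        v = self.value + other.value
        # Incoming errors add up; rounding the sum costs at most half an ulp of the result.
        return Tracked(v, self.err + other.err + 0.5 * math.ulp(v))

    __radd__ = __add__

    def __mul__(self, other):
        other = Tracked._lift(other)
        v = self.value * other.value
        # |a*b - a'*b'| <= |a'|*eb + |b'|*ea + ea*eb, plus half an ulp for the rounding.
        propagated = (abs(self.value) * other.err
                      + abs(other.value) * self.err
                      + self.err * other.err)
        return Tracked(v, propagated + 0.5 * math.ulp(v))

    __rmul__ = __mul__


# Sum the double nearest to 0.1 ten times and compare against exact rational arithmetic.
total = Tracked(0.0)
for _ in range(10):
    total = total + 0.1

exact = 10 * Fraction(0.1)
actual_error = abs(Fraction(total.value) - exact)
print(total.value, total.err, float(actual_error))
```

Note that the bound itself is computed in floating point and is not rounded outward, so treat it as an estimate for this particular run rather than a guarantee. If your inputs vary, you would rerun this over many samples and look at the distribution of `err`.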