I think the part I had trouble with was seeing why AD would give better results. I still don't fully grok it, but the idea that AD doesn't necessarily work on the full equation, only on the equation as it was actually evaluated at a particular point, is finally starting to click for me.
I don't get how it helps with some discontinuity problems, but I think I can see how those aren't as important in many contexts.
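
For what it's worth, here is a minimal sketch of the "evaluated at a spot" idea, assuming a toy forward-mode AD built from dual numbers. The names `Dual`, `_lift`, `f`, and `derivative` are made up for illustration and don't come from any real library, and the piecewise `f` is just an example I picked. Each value carries its derivative along with it, so the result only reflects the operations that actually ran for the given input:

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """Toy dual number (hypothetical, for illustration only)."""
    val: float  # value of the expression at the evaluation point
    der: float  # derivative of the expression at that same point

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, i.e. with derivative 0."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def f(x):
    """Assumed example: x**3 for x > 0, 2*x otherwise (kink at 0)."""
    if x.val > 0:
        return x * x * x
    return 2 * x


def derivative(func, x0):
    """Seed dx/dx = 1 and read the derivative off the result."""
    return func(Dual(float(x0), 1.0)).der


print(derivative(f, 2.0))   # 12.0, from the x**3 branch (3 * 2**2)
print(derivative(f, -1.0))  # 2.0, from the 2*x branch
print(derivative(f, 0.0))   # 2.0, whichever branch ran at the kink
```

Nothing here ever looks at `f` as a whole formula. At `x = 2.0` only the `x * x * x` branch is executed, so only that branch gets differentiated. At the kink (`x = 0.0`) this toy version just reports the derivative of the branch that happened to run, which as far as I can tell is the sense in which the discontinuity gets sidestepped rather than solved.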