With most machine learning algorithms I used to get Shapley values or other 'explainable AI' metrics (at a large cost compared to simple inference, yes). It's very unsettling and frustrating to work without them now on LLMs.
Kind of. Tesseract's confidence is just a raw model probability output. You could easily use the entropy associated with each token coming out of an LLM to do the same thing.
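To make the entropy idea concrete, here's a rough sketch. It assumes you can get the raw logits (or full log-probs) for each generated token, which not every hosted API exposes. The function names and the toy logits are made up for illustration, and normalizing by log(vocab size) is just one way to squash the result into a 0 to 1 score.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_entropy(logits):
    # Shannon entropy (in nats) of the next-token distribution.
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def normalized_confidence(logits):
    # Hypothetical helper: 1.0 means fully peaked, 0.0 means uniform.
    n = len(logits)
    if n < 2:
        return 1.0
    return 1.0 - token_entropy(logits) / math.log(n)

# Toy per-token logits over a 4-word vocabulary (invented values).
steps = {
    "peaked": [10.0, 0.0, 0.0, 0.0],
    "flat": [1.0, 1.0, 1.0, 1.0],
}

for name, logits in steps.items():
    print(name, round(token_entropy(logits), 4), round(normalized_confidence(logits), 4))
```

Low entropy means the model put nearly all its mass on one token, and high entropy means it was spread across many. Like Tesseract's number, this is still just a raw model probability, so it says how peaked the distribution was, not whether the token is correct.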