Interesting - have you tried sending the image and 'hallucinated' text together to a review LLM to fix mistakes?
I don't have a use case of 100s or 1000s of hand-written notes have to be transcribed. I have only done this with whiteboard discussion snapshots and it has worked really well.