0 points · llm_trw · 1y ago · 0 comments

I'm not using generative models to fill in details that aren't present in the original document. If the original has a typo, the transcript will have the same typo. If you want to fix that, you can run another model on top of it.

Lerc · 1y ago

I realise that. The point is that a user is implicitly committing to the baseline error rate of whatever process created the document. If any additional loss were insignificant in proportion to that error rate, it would be unreasonable to reject it on that basis.
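
To make the proportionality argument concrete, here is a minimal sketch. It assumes the two error sources (document creation and transcription) act independently on each character and that one never accidentally corrects the other. The function names and the rates in the usage note are hypothetical, chosen only for illustration, and are not measurements from any real OCR system.

```python
def combined_error_rate(baseline: float, added: float) -> float:
    """Chance a character is wrong after two independent error sources.

    A character survives only if neither stage corrupts it
    (assumption: errors are independent and never cancel out).
    """
    return 1.0 - (1.0 - baseline) * (1.0 - added)


def relative_increase(baseline: float, added: float) -> float:
    """Fractional growth of the error rate caused by the second stage."""
    if baseline <= 0.0:
        raise ValueError("baseline must be positive to compute a ratio")
    return (combined_error_rate(baseline, added) - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical rates: 1 error per 100 characters in the source,
    # 1 per 10,000 introduced by transcription.
    baseline, added = 1e-2, 1e-4
    print(f"combined: {combined_error_rate(baseline, added):.6f}")
    print(f"relative increase: {relative_increase(baseline, added):.2%}")
```

Under these assumed rates the second stage raises the overall error rate by roughly one percent of the baseline, which is the sense in which the additional loss would be insignificant in proportion to what the user has already accepted.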

yigitkonur35 · 1y ago

You're right. For my API that prepares PDFs for LLMs, fixing typos makes sense. But yeah, keeping the original text is crucial for most OCR tasks.
