Running OCR on a document is twice as expensive as processing the output with the most expensive GPT offering. Intuitively, this was kind of unexpected for me. It was only when I did some calculations in Excel that I realized it.
If you’re able to halve the pricing for Layout output then you’re unblocking lots of use cases out there.
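The kind of back-of-the-envelope comparison described above can be sketched roughly like this. The function name, parameters, and all the numbers are hypothetical placeholders, not real prices from any provider; they are picked only so the result matches the "twice as expensive" ratio mentioned in the comment.

```python
def cost_ratio(ocr_usd_per_page, llm_usd_per_million_tokens, tokens_per_page):
    """Return how many times more the OCR step costs than the LLM step, per page."""
    # Convert the per-token LLM price into a per-page price.
    llm_usd_per_page = llm_usd_per_million_tokens * tokens_per_page / 1_000_000
    return ocr_usd_per_page / llm_usd_per_page


# Placeholder inputs (assumptions, not quoted prices):
# OCR at 0.01 USD per page, LLM at 10 USD per million tokens, 500 tokens per page.
ratio = cost_ratio(ocr_usd_per_page=0.01,
                   llm_usd_per_million_tokens=10.0,
                   tokens_per_page=500)
print(f"OCR costs {ratio:.1f}x the LLM step per page")
```

Plugging in your own per-page OCR price, token price, and average tokens per page shows quickly whether halving the OCR price would change the picture for your workload.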
I guess anything up to 5 ¢ per page would be acceptable. But I'm afraid my company wouldn't be a customer. We are in Germany and we deal with particularly protected private data; there is no chance that we would send this data to a cloud service.
The models (currently) fit in 24 GB of VRAM when run sequentially with small enough batch sizes, so a local server with consumer-grade GPUs wouldn't be impossible.
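To illustrate why "sequentially" matters here: if only one model is resident at a time, the peak memory is set by the largest model rather than the sum of all of them. A minimal sketch, where the model sizes and the helper names are made-up assumptions and not measurements of any real pipeline:

```python
def fits_sequentially(model_sizes_gb, vram_gb, overhead_gb=0.0):
    """True if each model fits on its own (models are loaded one at a time)."""
    return max(model_sizes_gb) + overhead_gb <= vram_gb


def fits_concurrently(model_sizes_gb, vram_gb, overhead_gb=0.0):
    """True if all models fit in memory at the same time."""
    return sum(model_sizes_gb) + overhead_gb <= vram_gb


# Hypothetical per-model footprints in GB (weights plus activations at a small batch size).
sizes = [10, 12, 8]
print(fits_sequentially(sizes, vram_gb=24))  # largest model is 12 GB
print(fits_concurrently(sizes, vram_gb=24))  # all three together need 30 GB
```

The trade-off is throughput: swapping models in and out costs time, which is why small batch sizes and sequential loading go together on a single consumer card.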
Specific things like evidentiary use would want 100% accuracy, but that's at a level where any document processing would be suspect.
What is the typical range for error rate in PDF generation in various fields? Even robust technical documents have the occasional typo.