0 points
sushid
1y ago
Is that not just traditional OCR applied on top of an LLM?
2 comments · 2 top-level
energy123
1y ago
It's possible they have a software layer that does that, but I was assuming they don't, because the open-source multimodal models don't.
maxlamb
1y ago
No, it’s not; it’s a multimodal transformer model.