0 points · Kiro · 1y ago · 0 comments

Why all those steps? Why not just file + prompt to JSON directly?

1 comment · 1 top-level

For now, having the text is still pretty important for quality output. Vision models are quite good, but they are not a replacement for a quality OCR step. Combining text and vision is compelling too.
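To make the Text + Vision idea concrete, here is a rough sketch of bundling the OCR output and the page image into a single extraction request. Everything here is an assumption for illustration: `build_extraction_request`, the payload keys, and the sample invoice text are hypothetical and do not correspond to any particular model API or to the pipeline being discussed.

```python
import base64


def build_extraction_request(ocr_text: str, image_bytes: bytes, schema_hint: str) -> dict:
    """Bundle OCR text and the raw page image into one hypothetical payload.

    The payload shape is made up for illustration; a real model API
    will expect its own request format.
    """
    return {
        "instructions": f"Extract the fields described by this schema as JSON: {schema_hint}",
        "inputs": [
            # The OCR text gives the model exact characters to copy from.
            {"type": "text", "content": ocr_text},
            # The image lets the model see layout the OCR text loses.
            {
                "type": "image",
                "encoding": "base64",
                "content": base64.b64encode(image_bytes).decode("ascii"),
            },
        ],
    }


# Placeholder inputs: real usage would pass OCR output and actual image bytes.
request = build_extraction_request(
    ocr_text="Invoice #123\nTotal: 40.00",
    image_bytes=b"\x89PNG placeholder bytes",
    schema_hint='{"invoice_number": "string", "total": "number"}',
)
```

The point of sending both is that the text anchors exact values while the image carries tables, stamps, and other layout cues that a plain OCR dump tends to flatten.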
