Better HN
0 points
grepfru_it
9mo ago
0 comments
I'm curious: what is the need for 70 t/sec?
Aeolun
9mo ago
Waiting minutes for your call to succeed is too frustrating?
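To put "waiting minutes" in perspective, a rough back-of-the-envelope sketch (the 2000-token response size and the comparison rates below are illustrative assumptions, not figures from the thread):

```python
# Generation time for a streamed LLM response is roughly tokens / decode rate.
# Response size and rates here are assumed for illustration.

def generation_seconds(tokens: int, tokens_per_sec: float) -> float:
    """Time to generate a response of `tokens` at a given decode rate."""
    return tokens / tokens_per_sec

response_tokens = 2000  # assumed long-form response
for rate in (5, 20, 70):
    print(f"{rate:>3} t/s -> {generation_seconds(response_tokens, rate):6.1f} s")
```

At 5 t/s the same response takes over six minutes; at 70 t/s it lands in about half a minute, which is the difference Aeolun is pointing at.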
ekianjo
9mo ago
Depends entirely on the use case. Not every LLM workflow is a chatbot.
jbellis
9mo ago
No, but if you're not latency-sensitive you should probably be using DeepSeek V3 (cheaper than Flash, significantly smarter).
1 more reply
cootsnuck
9mo ago
High-concurrency voice AI systems.
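A minimal sketch of why voice AI pushes throughput: every live call needs its own real-time decode stream, so required capacity scales with concurrent sessions. Both numbers below are assumptions for illustration, not figures from the thread:

```python
# Aggregate decode throughput a voice backend must sustain.
# Per-session rate and session count are assumed for illustration.
per_session_tps = 70       # decode speed one live conversation needs (assumed)
concurrent_sessions = 50   # simultaneous calls on one backend (assumed)
aggregate_tps = per_session_tps * concurrent_sessions
print(f"backend must sustain ~{aggregate_tps} t/s in aggregate")
```

Unlike a batch workflow, none of these streams can fall behind real time without the caller hearing a pause.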
grepfru_it
OP
9mo ago
Why are you self hosting that?