0 points
grepfru_it
1y ago
I am curious: what is the need for 70 t/sec?
5 comments · 2 top-level
Aeolun
1y ago
Waiting minutes for your call to succeed is too frustrating?
ekianjo
1y ago
Depends entirely on the use case. Not every LLM workflow is a chatbot.
jbellis
1y ago
No, but if you're not latency-sensitive you should probably be using DeepSeek v3 (cheaper than Flash, significantly smarter).
1 more reply
cootsnuck
1y ago
High concurrency voice AI systems.
grepfru_it
OP
1y ago
Why are you self hosting that?