0 points
popcorncowboy
1y ago
0 comments
> Developers can run inference on Llama 3.1 405B on their own infra at roughly 50% the cost of using closed models like GPT-4o
Does anyone have details on exactly what this means or where/how this metric gets derived?
rohansood15
1y ago
I am guessing these are prices on services like AWS Bedrock (their post is down right now).
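If the ~50% figure does come from hosted per-token pricing, the derivation would just be a ratio of published rates. A minimal sketch of that comparison, using placeholder prices that are assumptions for illustration only (they are not actual quotes from Bedrock, Meta, or OpenAI):

```python
# Placeholder per-token prices in $ per 1M tokens -- assumed values,
# NOT real published rates; substitute current provider pricing.
llama_405b_price = 5.00   # assumed hosted open-model rate
gpt4o_price = 10.00       # assumed closed-model rate

# The "roughly 50% the cost" claim would follow from a ratio like this:
ratio = llama_405b_price / gpt4o_price
print(f"Open-model cost is {ratio:.0%} of the closed-model cost")
```

Under these assumed numbers the ratio comes out to 50%; the real comparison would also need to account for input vs. output token rates and self-hosting overhead (GPUs, utilization), which a single ratio hides.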
PlattypusRex
1y ago
A big chunk of that is probably the fact that you don't have to pay the margin of someone who runs inference off-premises for a profit.