What is the size of this pool, i.e. how many GPUs would it take for an individual user to run their own equivalent today? Let's assume the LLM is fully downloadable.
I ask because, if LLMs stop improving exponentially, surely soon enough we will ALL be able to run un-quantised local LLMs of sufficient quality for day-to-day tasks.
What do my fellow HN users think? Should they be autoflagged or marked as paywalled?
Should contributors post a summary that they have themselves written, and add the link elsewhere in their post for those interested? I do think that a smidgen of extra effort would be valued greatly by all.
I know there is HN guidance on this, and I have previously and regretfully posted paywalled articles myself, but I'm keen to see if anyone has any new or better ideas.