Do we want knowledge communities like Stack Overflow or Reddit to continue to exist? Should big AI providers that train on their data share some of the value back to the community? Is there an ethical way for web communities to license data to AI providers?
I hope the answer is yes and that there is a path to a productive partnership, one that allows public communities where knowledge is shared freely to thrive, while also bringing more grounded and vetted content to AI systems that are often closed and require a subscription to access.
How would they do that? So far, the LLMs can't be trusted to produce accurate answers. The AI companies can pay money to the data sources, but they can't really offer back anything useful (yet, imho).
They offered convenience by burning money, and the mismanagement and pre-IPO shenanigans certainly are not helping.
They don't own the content, the communities, or the user base, which can move from IRC to AOL to Discord. What did they learn from the dead communities of the past? Are they selling traction and convenience, which they do own, or are they claiming to sell content, which they don't? Users curated the content, and the most effective mods have left. The content in large parts of their sites became stale or degraded long before OpenAI existed. Graveyard communities can neither curate nor pay for server costs.
AI is convenient curation, and people are paying for the convenience. AI sites are also losing money per click, with server costs that are out of the galaxy compared to the cached-HTML-and-Elasticsearch-serving crowd.
We have seen the ridiculousness of the AI sites' attempts to introduce management features, short of putting penguins in the desert for animal diversity. But the great teachers of bad management features were Reddit and Stack Overflow, who also actively killed community-developed management modules.
They are failing because they lack a basic understanding of the teachings of centuries of civil society, and they make up what is right, wrong, or politically correct ad hoc, based on marketing. Merely trying to avoid bad publicity that could scare off the potential IPO crowd only builds up community debt and grievance. That is what has been killing them.
Wikipedia has not been crying foul but has been curating the highest-quality content for AI, and on a low-cost setup for its size. I just think it's better to donate content and money there.
No, we want better knowledge communities to exist and for Stack Overflow and Reddit to cease to exist.
There are often clues in the comments that are more helpful than the "answer," and often the outdated answers have the most votes.
All this to ask: how on earth is something like an LLM expected to reconcile those issues?
Though I suppose that is a short-sighted concern in itself, given that the way in which we work will begin to evolve quickly as AI becomes more powerful.
In the end, I guess they end up being a positive press story for Google / OpenAI / etc.?
Isn't it the knowledge of its users?
Expect a lot of enriched answers like these coming soon from Gemini/Bard :-p
These are literally questions I've given to project managers to help create better requirements, but ultimately, as a dev, you have to come up with "something" regardless and redo the work once the customer complains. Stupid GPTs cutting the line!
Overall this all feels so unimaginative. With all the resources these companies have the only solution they can come up with for the search problem is "just throw AI at it." I could come up with that. It's not clever.
I've basically completely replaced Google in my day-to-day, unless I need to look up a specific location of something in the physical world or something that happened recently.
That's ...not good.
GPTx gets a lot of surface topics right, but when you delve into gritty, specific details, it will just start rambling like a straitjacketed lunatic with the confidence of a used car salesman. The rubber meets the road when I try to compile code that uses libraries or functions that don't exist, or when it leads me to hallucinated, imaginary GitHub repos. I worry that this use of GPTx would be like getting water from lead pipes: it would seem fine day-to-day while my mind is slowly poisoned with nonsense and insanity.
Google has certainly taken a nosedive in result quality over the last few years, but Kagi has been amazing for me lately.