0 points · civvv · 3mo ago · 0 comments

The RAG models are very competent at programming. I am worried about my job as a SWE in the near future, but didn't the MIT paper from about a week ago pretty much confirm that width-scaling the model is about to stop (or has already stopped) giving any measurable increase in quality, because the training data no longer overfills the model?

Any authentic training data from the pre-LLM era is assumed to have been used in training already, and synthetic or generated data gives worse-performing models, so the path of increasing the training data seems to be a dead end as well?

What is the next vector of training? Maybe data curation? Remove the low-quality entries and accept a smaller but more accurate data set?

I think the AI companies are starting to sweat a little, considering the promises they have made, their inability to deliver on them and turn a profit in their current state, and the slowing improvements.

Interesting times! We are either all out of jobs or a massive market crash is imminent, awesome...

1 comment · 1 top-level

cauefcr · 3mo ago

Different architectures, different RL training loops, maybe memory modules [1][2] as part of the architecture, a focus on efficiency, the giant troves of data we're generating by using claude code/gemini-cli/opencode: there's lots of research to be done.

[1] https://research.google/blog/titans-miras-helping-ai-have-lo... [2] https://github.com/deepseek-ai/Engram
