However, I don't want him to JUST play games 24/7; I'd like him to do something productive, like learning how to code. I'll meet with him once a week to check up on him.
My dilemma is that AI has completely changed how I personally code (and learn things), and all of the existing courses/curriculums for kids just feel a bit archaic...
Obviously, I can't just tell my nephew to go talk to ChatGPT all day long and learn. Without a project plan/target, he's gonna drift off (as I would too). I'm also thinking that, since my main goal here is just for him to learn, maybe we should start with basic math/science instead of going straight into coding?
I don't have any kids myself, so I'm curious to learn how all of the parents on HN are educating their kids today! Anything is helpful :)
From what I understand, the leap from GPT-2 to GPT-3 was mostly about scaling - more compute, more data. GPT-3 to GPT-4 probably followed the same path.
But in the year and a half since GPT-4, LLMs have gotten significantly better, especially the smaller ones. I'm consistently impressed by models like Claude 3.5 Sonnet, despite us supposedly reaching scaling limits.
What's driving these improvements? Is it thousands of small optimizations in data cleaning, training, and prompting? Or am I just deep enough in tech now that I'm noticing subtle changes more? Really curious to hear from people who understand the technical internals here.
I'm vaguely aware that training is a lot more complex than inference, so the fruits of further optimizing hardware for it are neither as low-hanging nor as rewarding.
But nevertheless, I'm curious to hear a bit more about the details from folks that work in those fields. Is it worth it? Is it possible? What's the complexity here?
That all makes sense to me, and I think it's the right direction to be headed. However, it's been a while since some of these projects/cool demos first appeared, and I still haven't seen anyone who uses agents as a core/regular part of their workflow.
I'm curious if you use these agents regularly or know someone who does. Or, if you're working on one of these, I'd love to know what some of the hidden challenges are in making a useful product with agents. What's the main bottleneck?
Any thoughts are welcome!
Is RAG still the way to go? Should one fine-tune the model on top of that data as well?
It seems that getting RAG to work well requires a lot of optimization. Are there any drag-and-drop solutions that work well? I know the OpenAI Assistants API has built-in knowledge retrieval; does anyone have experience with how good that is compared to other methods?
Or is it better to pretrain a custom model and then instruction-tune it?
Would love to know what you guys are all doing!