But I'm curious about the actual mechanics: How exactly does this feedback loop work? When I accept, reject, or modify the code that these models spit out, is that signal fed directly back into training?
I'm not necessarily against this, just genuinely curious about how the sausage is made.